ABSTRACT


The  synchronous  dataflow  (SDF)  programming  paradigm  has  been  used  extensively  in 
design  environments  for  multirate  signal  processing  applications.  In  this  paradigm,  the  repetition 
of  computations  is  specified  by  the  relative  rates  at  which  the  computations  consume  and  produce 
data.  This  implicit  specification  of  iteration  allows  a  compiler  to  easily  explore  alternative  nested 
loop structures for the target code with respect to their effects on code size, buffering requirements,
and throughput. In this paper, we develop important relationships between the SDF
description  of  an  algorithm  and  the  range  of  looping  structures  offered  by  this  description,  and  we 
discuss  how  to  improve  code  efficiency  by  applying  these  relationships. 

1  Introduction 


Synchronous  dataflow  (SDF)  is  a  restricted  form  of  the  dataflow  model  of  computation 
[5].  In  the  dataflow  model,  a  program  is  represented  as  a  directed  graph.  The  nodes  of  the  graph, 
also called actors, represent computations and the arcs represent data paths between computations.
In SDF [15], each node consumes a fixed number of data items, called tokens or samples,
per  invocation  and  produces  a  fixed  number  of  output  samples  per  invocation.  Figure  1  shows  an 
SDF  graph  that  has  three  actors  A,  B,  and  C.  Each  arc  is  annotated  with  the  number  of  samples 
produced  by  its  source  actor  and  the  number  of  samples  consumed  by  its  sink  actor.  The  “D”  on 
the arc between B and C represents a unit delay, which can be viewed as an initial sample that is
queued on the arc. SDF and related models have been studied extensively in the context of synthesizing
assembly code for signal processing applications, for example [7, 8, 9, 10, 17, 18, 19, 20].

In SDF, iteration is defined as the repetition induced when the number of samples produced
on an arc (per invocation of the source actor) does not match the number of samples consumed
(per sink invocation) [12]. For example, in figure 1, actor B must be invoked two times for
every  invocation  of  A.  Multirate  applications  often  involve  a  large  amount  of  iteration  and  thus 
subroutine  calls  must  be  used  extensively,  code  must  be  replicated,  or  loops  must  be  organized  in 
the  target  program.  The  use  of  subroutine  calls  to  implement  repetition  may  reduce  throughput 
significantly,  however,  particularly  for  graphs  involving  small  granularity.  On  the  other  hand,  we 
have found that code duplication can quickly exhaust on-chip program memory [11]. As an alternative,
we examine the problem of arranging loops in the target code.

In  [11],  How  demonstrated  that  by  clustering  connected  subgraphs  that  operate  at  the  same 
repetition-rate,  and  scheduling  these  consolidated  subsystems  each  as  a  single  unit,  we  can  often 
synthesize  loops  effectively.  This  technique  was  extended  in  [3]  to  cluster  across  repetition-rate 
changes  and  to  take  into  account  the  minimization  of  buffering  requirements.  Although  these 
techniques  proved  effective  over  a  large  range  of  applications,  they  do  not  always  yield  the  most 
compact  schedule  for  an  SDF  graph  [2]. 

In this paper we define a simple optimality criterion for the synthesis of compact loop
structures from an SDF graph. The criterion is based on the looped schedule notation introduced
in [3], in which loops in a schedule are represented by parenthesized terms of the form (n M_1 M_2
... M_k), where n is a positive integer, and each M_i represents an SDF actor or another (nested)
loop. For the graph in figure 1, for example, the looped schedule A(2 BC) specifies the firing


Fig.  1.  A  simple  SDF  graph.  Each  arc  is  annotated  with  the  number  of  samples  produced  by  its 
source  and  the  number  of  samples  consumed  by  its  sink.  The  “D”  designates  a  unit  delay. 
sequence ABCBC. Using this notation, we can define an optimally compact looped schedule as
one that contains only one lexical appearance of each actor in the SDF graph. We call such an
“optimal” looped schedule a single appearance schedule. More precisely, a single appearance
schedule is a looped schedule in which no actor appears more than once. For example, the looped
schedule CA(2 B)C for figure 1 is not a single appearance schedule since C appears twice. Thus,
either C must be implemented with a subroutine, or we must insert two versions of C’s code block
into the synthesized code. In the schedule A(2 CB), however, no actor appears more than once, so
it is a single appearance schedule; thus it allows in-line code generation without a code-size penalty.
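
The looped schedule notation can be illustrated with a short sketch. The representation below (a Python list whose items are actor names or `(count, body)` tuples) and the function names are assumptions made for illustration; they are not part of the original notation.

```python
from collections import Counter

# Assumed representation: a looped schedule is a list of terms; a term is an
# actor name (str) or a schedule loop written as a tuple (count, [terms]).
def firing_sequence(schedule):
    """Expand a looped schedule into its flat list of actor firings."""
    out = []
    for term in schedule:
        if isinstance(term, str):
            out.append(term)
        else:
            count, body = term
            out.extend(firing_sequence(body) * count)
    return out

def appearances(schedule):
    """Count lexical appearances of each actor; iteration counts are ignored."""
    counts = Counter()
    for term in schedule:
        if isinstance(term, str):
            counts[term] += 1
        else:
            counts.update(appearances(term[1]))
    return counts

def is_single_appearance(schedule):
    """True if no actor appears lexically more than once."""
    return all(n == 1 for n in appearances(schedule).values())

a_2bc = ["A", (2, ["B", "C"])]           # A(2 BC)
ca_2b_c = ["C", "A", (2, ["B"]), "C"]    # CA(2 B)C
print("".join(firing_sequence(a_2bc)))   # ABCBC
print(is_single_appearance(a_2bc), is_single_appearance(ca_2b_c))  # True False
```

Counting lexical appearances separately from firings mirrors the distinction in the text: code size depends on appearances, not on how often each actor fires.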

Our observations suggest that we can construct single appearance schedules for most practical
SDF graphs [2]. In this paper, we formally develop transformations that can be applied to
single appearance schedules to improve the efficiency of the target code. We also determine necessary
and sufficient conditions for an SDF graph to have a single appearance schedule. These
conditions were developed independently, in a different form, by Ritz et al. [20], although their
application of the conditions is quite different from ours. Ritz et al. discuss single appearance schedules
in the context of minimum activation schedules, which minimize the number of “context-switches”
between actors. For example, in the looped schedule A(2 CB) for figure 1, the invocations
of B and C are interleaved, and thus a separate activation is required for each invocation,
for a total of 5 activations. On the other hand, the schedule A(2 B)(2 C) requires only three
activations, one for each actor. Under the objectives of [20], the latter schedule is preferable, because
in that code-generation framework, there is a large overhead involved with each activation. With
effective register allocation and instruction scheduling, however, such overhead can often be avoided,
as [18] demonstrates. Thus, we prefer the former schedule, which has less looping overhead
and requires less memory for buffering.
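
The activation counts quoted above can be reproduced with a small sketch. It assumes that an activation corresponds to a maximal run of consecutive firings of the same actor, which matches the counts of 5 and 3 given in the text; the list-and-tuple schedule representation and the function names are illustrative assumptions.

```python
from itertools import groupby

def firing_sequence(schedule):
    """Expand a looped schedule (list of actor names and (count, body) tuples)."""
    out = []
    for term in schedule:
        if isinstance(term, str):
            out.append(term)
        else:
            count, body = term
            out.extend(firing_sequence(body) * count)
    return out

def activations(schedule):
    """Count maximal runs of consecutive firings of one actor (assumed definition)."""
    return sum(1 for _ in groupby(firing_sequence(schedule)))

interleaved = ["A", (2, ["C", "B"])]        # A(2 CB) -> A C B C B
grouped = ["A", (2, ["B"]), (2, ["C"])]     # A(2 B)(2 C) -> A B B C C
print(activations(interleaved), activations(grouped))  # 5 3
```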

Our  focus  has  been  on  creating  a  general  framework  for  developing  scheduling  algorithms 
that  provably  generate  single  appearance  schedules  when  possible,  and  that  incorporate  other 
scheduling  objectives,  such  as  the  minimization  of  buffering  requirements,  in  a  manner  that  is 
guaranteed  not  to  interfere  with  code  compaction  goals.  The  framework  modularizes  different 
parts  of  the  scheduling  process,  and  the  compiler  developer  has  freedom  to  experiment  with  the 
component modules, while the framework guarantees that the interaction of the modules does not
impede code size minimization goals. We have applied conditions for the existence of a single
appearance schedule to define our scheduling framework. Due to space limitations, we do not
elaborate further on this scheduling framework in this paper; instead, we refer the reader to [2].

We  begin  with  a  review  of  the  SDF  model  of  computation  and  the  terminology  associated 
with looped schedules for SDF graphs. SDF principles were introduced in [13] in terms of connected
graphs. However, for developing scheduling algorithms it is useful to consider non-connected
graphs as well, so in section 3 we extend SDF principles to non-connected SDF graphs. In sections
4 and 5, we discuss a schedule transformation called factoring, which can produce large
reductions in the amount of memory required for buffering. Finally, in section 6, we develop conditions
for the existence of a single appearance schedule, and we discuss the application of these
conditions  to  synthesizing  single  appearance  schedules  whenever  they  exist.  The  sections  form  a 
linear  dependence  chain  —  each  section  depends  on  the  previous  ones.  For  reference,  a  summary 
of  terminology  and  notation  can  be  found  in  the  glossary  at  the  end  of  the  paper. 

2  Background 


2.1  Synchronous  Dataflow 

An SDF program is normally translated into a loop, where each iteration of the loop executes
one cycle of a periodic schedule for the graph. In this section we summarize important properties
of periodic schedules.

For an SDF graph G, we denote the set of nodes in G by N(G) and the set of arcs in G by
A(G). For an SDF arc α, we let source(α) and sink(α) denote the nodes at the source and sink of α;
we let p(α) denote the number of samples produced by source(α), c(α) denote the number of samples
consumed by sink(α), and we denote the delay on α by delay(α). We define a subgraph of G
to be that SDF graph formed by any Z ⊆ N(G) together with the set of arcs {α ∈ A(G) | source(α),
sink(α) ∈ Z}. We denote the subgraph associated with the subset of nodes Z by subgraph(Z, G); if
G is understood, we may simply write subgraph(Z).

We can think of each arc in G as having a FIFO queue that buffers the tokens that pass
through the arc. Each FIFO contains an initial number of samples equal to the delay on the associated
arc. Firing a node in G corresponds to removing c(α) tokens from the head of the FIFO for
each input arc α, and appending p(β) tokens to the FIFO for each output arc β. After a sequence of
0 or more firings, we say that a node is fireable if there are enough tokens on each input FIFO to
fire the node. An admissible schedule for G is a sequence S = S_1 S_2 ... S_N of nodes in G such that
each S_i is fireable immediately after S_1, S_2, ..., S_{i-1} have fired in succession.

If some S_i is not fireable immediately after its antecedents have fired, then there is at least
one arc α such that (1) sink(α) = S_i, and (2) the FIFO associated with α contains fewer than
c(α) tokens just prior to the ith firing in S. For each such α, we say that S terminates on α at firing
S_i. Clearly then, S is admissible if and only if it does not terminate on any arc α.
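
The definitions of fireability and termination can be made concrete by simulating the FIFO token counts. The sketch below is illustrative only: the arc rates used for figure 1 (A produces 2 and B consumes 1; B produces 1 and C consumes 1, with one delay) are an assumption consistent with the text, since the figure itself is not reproduced here, and the function name is hypothetical.

```python
def first_termination(arcs, firings):
    """Return (firing index, arc name) where the sequence first terminates, else None.

    arcs maps an arc name to (source, sink, produced, consumed, delay).
    """
    tokens = {name: spec[4] for name, spec in arcs.items()}  # initial tokens = delays
    for i, node in enumerate(firings, start=1):
        # A node is fireable only if every input FIFO holds enough tokens.
        for name, (src, snk, p, c, _) in arcs.items():
            if snk == node and tokens[name] < c:
                return (i, name)
        # Fire: consume from every input arc, then produce on every output arc.
        for name, (src, snk, p, c, _) in arcs.items():
            if snk == node:
                tokens[name] -= c
            if src == node:
                tokens[name] += p
    return None

# Assumed rates for figure 1 (not given numerically in this excerpt).
fig1 = {"AB": ("A", "B", 2, 1, 0), "BC": ("B", "C", 1, 1, 1)}
print(first_termination(fig1, "ABCBC"))  # None: admissible
print(first_termination(fig1, "ACBCB"))  # None: C can fire first thanks to the delay
print(first_termination(fig1, "ACCBB"))  # (3, 'BC'): second C finds an empty FIFO
```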

We say that a schedule S is a periodic schedule if it invokes each node at least once and
produces no net change in the number of tokens on a FIFO; that is, for each arc α, (the number of times
source(α) is fired in S) × p(α) = (the number of times sink(α) is fired in S) × c(α). A valid schedule
is a schedule that is both periodic and admissible. For a given schedule, we denote the ith firing,
or invocation, of actor N by N_i, and we call i the invocation number of N_i.

In [14], it is shown that for each connected SDF graph G, there is a unique minimum number
of times that each node needs to be invoked in a periodic schedule. We specify these minimum
numbers of firings by a vector of positive integers q_G, which is indexed by the nodes in G, and we
denote the component of q_G corresponding to a node N by q_G(N). Corresponding to each periodic
schedule S, there is a positive integer J(S) called the blocking factor of S, such that S invokes each
N ∈ N(G) exactly J(S) q_G(N) times. We call q_G the repetitions vector of G. If G is understood
from context, we may refer to q_G simply as q. From the definition of a repetitions vector, we
obtain the following three facts:

Fact 1: The components of a repetitions vector are collectively coprime; that is, they have no
common divisor that exceeds unity.

Fact 2: The balance equation q(source(α)) × p(α) = q(sink(α)) × c(α) is satisfied for each arc α
in G. Also, any positive-integer vector that satisfies the balance equations is a positive-integer
multiple of the repetitions vector.

Fact 3: Suppose that G is a connected SDF graph and S is an admissible schedule for G. If there
is a positive integer J_0 such that S invokes each N ∈ N(G) exactly J_0 q(N) times, then S is a valid
schedule.

Thus a positive-integer vector indexed by the nodes in a connected SDF graph is the repetitions
vector for that SDF graph iff it satisfies the balance equations and its components are
coprime.
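
This characterization suggests one way to compute a repetitions vector: propagate rational firing rates along the arcs using the balance equations, then scale to the smallest coprime positive integers. The following is a minimal sketch under that approach, not the procedure of [14]; the function name and the example rates (assumed for figure 1, which is not reproduced here) are illustrative.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def repetitions_vector(nodes, arcs):
    """Smallest positive-integer solution of the balance equations.

    arcs is a list of (source, sink, produced, consumed); the graph is
    assumed to be connected.
    """
    rate = {nodes[0]: Fraction(1)}
    pending = [nodes[0]]
    while pending:
        n = pending.pop()
        for src, snk, p, c in arcs:
            if src == n and snk not in rate:
                rate[snk] = rate[n] * p / c      # q(src) * p = q(snk) * c
                pending.append(snk)
            elif snk == n and src not in rate:
                rate[src] = rate[n] * c / p
                pending.append(src)
    if len(rate) != len(nodes):
        raise ValueError("graph is not connected")
    for src, snk, p, c in arcs:
        if rate[src] * p != rate[snk] * c:
            raise ValueError("balance equations have no positive solution")
    # Clear denominators, then divide out any common factor (fact 1).
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    q = {n: int(r * scale) for n, r in rate.items()}
    g = reduce(gcd, q.values())
    return {n: v // g for n, v in q.items()}

# Assumed rates for figure 1: A -(2,1)-> B -(1,1)-> C.
print(repetitions_vector(["A", "B", "C"], [("A", "B", 2, 1), ("B", "C", 1, 1)]))
```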

Given  an  SDF  graph  G,  we  say  that  G  is  strongly  connected  if  for  any  pair  of  distinct 
nodes  A,  B  in  G,  there  is  a  directed  path  from  A  to  B  and  a  directed  path  from  B  to  A.  We  say  that 
a  strongly  connected  SDF  graph  is  nontrivial  if  it  contains  more  than  one  node.  Also,  we  say  that 
a subset Z of nodes in G is strongly connected if subgraph(Z, G) is strongly connected. Finally, a
strongly connected component of G is a strongly connected subset Z of N(G) such that no strongly
connected subset of N(G) properly contains Z.

Although  there  is  no  theoretical  impediment  to  infinite  SDF  graphs,  we  currently  do  not 
have  any  practical  use  for  them,  so  in  this  paper,  we  deal  only  with  SDF  graphs  that  have  a  finite 
number  of  nodes  and  arcs.  Also,  unless  otherwise  stated,  we  deal  only  with  SDF  graphs  for  which 
a  valid  schedule  exists. 

2.2  Looped  Schedule  Terminology 

Definition 1: A schedule loop is a parenthesized term of the form (n T_1 T_2 ... T_m), where n is a
positive integer and each T_i represents an SDF node or another schedule loop. (n T_1 T_2 ... T_m) represents
the successive repetition n times of the firing sequence T_1 T_2 ... T_m. If L = (n T_1 T_2 ... T_m)
is a schedule loop, we say that n is the iteration count of L, each T_i is an iterand of L, and T_1 T_2 ...
T_m constitutes the body of L. A looped schedule is a sequence V_1 V_2 ... V_k, where each V_i is
either an actor or a schedule loop. Since a looped schedule is usually executed repeatedly, we
refer to each V_i as an iterand of the associated looped schedule.

When  referring  to  a  looped  schedule,  we  often  omit  the  “looped”  qualification  if  it  is 
understood  from  context;  similarly,  we  may  refer  to  a  schedule  loop  simply  as  a  “loop”.  Given  a 
looped  schedule  S,  we  refer  to  any  contiguous  sequence  of  actor  appearances  and  schedule  loops 
in  S  as  a  subschedule  of  S.  More  formally,  a  subschedule  of  S  is  any  sequence  of  successive 
iterands of S or any sequence of successive iterands of a schedule loop contained in S. For example,
the schedules (3 AB)C and (2 B(3 AB)C)A are both subschedules of A(2 B(3 AB)C)A(2 B),
whereas  (3  AB)CA  is  not.  By  this  definition,  every  schedule  loop  in  S  is  a  subschedule  of  S.  If  the 
same  firing  sequence  appears  in  more  than  one  place  in  a  schedule,  we  distinguish  each  instance 
as  a  separate  subschedule.  For  example,  in  (3  A(2  BC)D(2  BC)),  “(2  BC)”  appears  twice,  and 
these correspond to two distinct subschedules. In this case, the content of a subschedule is not sufficient
to specify it; we must also specify the lexical position, as in “the second appearance of (2
BC)”.

Given a looped schedule S and an actor N that appears in S, we define inv(N, S) to be the
number of times that S invokes N. Similarly, if S_0 is a subschedule, we define inv(S_0, S) to be the
number of times that S invokes S_0. For example, if S = A(2 (3 BA)C)BA(2 B), then inv(B, S) = 9,
inv((3 BA), S) = 2, and inv(first appearance of BA, S) = 6. Also, we refer to the schedule that a
looped  schedule  S  represents  as  the  firing  sequence  generated  by  S.  For  example,  the  firing 
sequence  generated  by  A(2  (3  BA)C)BA(2  B)  is  ABABABACBABABACBABB.  When  there  is 
no  ambiguity,  we  occasionally  do  not  distinguish  between  a  looped  schedule  and  the  firing 
sequence  that  it  generates. 

Finally, given an SDF graph G, an arc α in G, a looped schedule S for G, and a nonnegative
integer i, we define P(α, i, S) to denote the number of firings of source(α) that precede the ith
invocation of sink(α) in S. For example, consider the SDF graph in figure 1 and let α denote the
arc from B to C. Then P(α, 2, A(2 BC)) = 2, the number of firings of B that precede invocation C_2
in the firing sequence ABCBC.
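
The quantities inv and P for actors can be checked mechanically on the examples above. In this sketch the schedule representation (lists of actor names and `(count, body)` tuples) and the function names `inv` and `preceding_source_firings` are assumptions made for illustration; only the actor form of inv is covered, not the subschedule form.

```python
def firing_sequence(schedule):
    """Expand a looped schedule into the firing sequence it generates."""
    out = []
    for term in schedule:
        if isinstance(term, str):
            out.append(term)
        else:
            count, body = term
            out.extend(firing_sequence(body) * count)
    return out

def inv(actor, schedule):
    """Number of times the schedule invokes the given actor."""
    return firing_sequence(schedule).count(actor)

def preceding_source_firings(source, sink, i, schedule):
    """P(arc, i, S) for the arc source -> sink: source firings before sink's ith firing."""
    src_seen = snk_seen = 0
    for node in firing_sequence(schedule):
        if node == sink:
            snk_seen += 1
            if snk_seen == i:
                return src_seen
        if node == source:
            src_seen += 1
    raise ValueError("sink is invoked fewer than i times")

s = ["A", (2, [(3, ["B", "A"]), "C"]), "B", "A", (2, ["B"])]  # A(2 (3 BA)C)BA(2 B)
print("".join(firing_sequence(s)))   # ABABABACBABABACBABB
print(inv("B", s))                   # 9
print(preceding_source_firings("B", "C", 2, ["A", (2, ["B", "C"])]))  # 2
```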

3 Non-connected SDF Graphs

The  fundamentals  of  SDF  were  introduced  in  terms  of  connected  SDF  graphs  [13,  15].  In 
this  section,  we  extend  some  basic  principles  of  SDF  to  non-connected  SDF  graphs.  We  begin 
with  two  definitions. 

Definition 2: Suppose that G is an SDF graph, M is any subset of nodes in G, and M_s ⊆ M. We
say that M_s is a maximal connected subset of M if subgraph(M_s, G) is connected, and no subset of
M that properly contains M_s induces a connected subgraph in G. Every subset of nodes in an SDF
graph has a unique partition into one or more maximal connected subsets. For example, in figure
2, the subset of nodes {A, B, C, E, G, H} has three maximal connected subsets: {A, H}, {B, E, C},
and {G}. If M_s is a maximal connected subset of N(G), then we say that subgraph(M_s, G) is a


Fig. 2. An example used to illustrate maximal connected subsets. The direction and the
SDF parameters for each arc are not shown because they are not relevant to connectedness.


maximal connected subgraph of G. We denote the set of maximal connected subgraphs in G by
max_connected(G). Thus, for figure 3, max_connected(G) = {subgraph({A, B}), subgraph({C,
D})}.

Definition 3: Suppose that S is a looped schedule for an SDF graph G and N_s ⊆ N(G). If we remove
from S all actors that are not in N_s, and remove all empty loops (schedule loops that contain no
actors in their bodies) that result, we obtain another looped schedule, which we call the restriction
of S to N_s, and which we denote by restriction(S, N_s). For example, restriction((2 (2 B)(5
A)), {A, C}) = (2 (5 A)), and restriction((5 C), {A, B}) is the null schedule. If G_s is a subgraph of
G, then we define restriction(S, G_s) = restriction(S, N(G_s)).
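
The restriction operation of Definition 3 is naturally recursive. The sketch below assumes the same illustrative representation of looped schedules as nested lists and `(count, body)` tuples; the function name follows the text, but the code itself is not from the original paper.

```python
def restriction(schedule, keep):
    """Remove actors not in `keep`, then drop any loops left with empty bodies."""
    out = []
    for term in schedule:
        if isinstance(term, str):
            if term in keep:
                out.append(term)
        else:
            count, body = term
            inner = restriction(body, keep)
            if inner:                      # discard empty loops
                out.append((count, inner))
    return out

s = [(2, [(2, ["B"]), (5, ["A"])])]             # (2 (2 B)(5 A))
print(restriction(s, {"A", "C"}))               # [(2, [(5, ['A'])])], i.e. (2 (5 A))
print(restriction([(5, ["C"])], {"A", "B"}))    # [] (the null schedule)
```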

The restriction of an admissible schedule S to a subset of actors N_s fully specifies the
sequence of token populations occurring on each arc in the corresponding subsystem. More precisely,
for any actor A ∈ N_s, any positive integer i such that 1 ≤ i ≤ inv(A, S), and any input arc α
of A contained in subgraph(N_s), the number of tokens queued on α just prior to the ith execution
of A in S equals the number of tokens queued on α just prior to the ith invocation of A in an execution
of restriction(S, N_s). Thus, we have the following fact.

Fact 4: If S is a valid schedule for an SDF graph G and G_s is a subgraph of G, then restriction(S,
G_s) is a valid schedule for G_s.

The concept of blocking factor does not apply directly to SDF graphs that are not connected.
For example, in figure 3 the minimal vector of repetitions for a periodic schedule is given


Fig.  3.  A  simple  non-connected  SDF  graph 


by ρ(A, B, C, D) = (1, 1, 1, 1). The schedule A(2 C)B(2 D) is a valid schedule for this example,
but this schedule corresponds to a blocking factor of 1 for subgraph({A, B}) and a blocking factor
of 2 for subgraph({C, D}); there is no single scalar blocking factor associated with A(2 C)B(2
D).

Now suppose that S is a valid schedule for an arbitrary SDF graph G. By fact 4, for each C
∈ max_connected(G), we have that restriction(S, C) is a valid schedule for C. Thus, associated
with S, there is a vector of positive integers J_S, indexed by the maximal connected subgraphs of
G, such that ∀ C ∈ max_connected(G), ∀ N ∈ N(C), inv(N, S) = J_S(C) q_C(N). We call J_S the
blocking vector of S. For example, if S = A(2 C)B(2 D) for figure 3, then J_S(subgraph({A, B})) =
1, and J_S(subgraph({C, D})) = 2. On the other hand, if G is connected, then J_S has only one component,
which is the blocking factor of S, J(S).
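
The blocking vector can be computed directly from its defining relation between invocation counts and per-component repetitions vectors. In the sketch below, the component names, the dictionary layout, and the function name are illustrative assumptions; the repetitions vectors (1, 1) for each component of figure 3 are taken from the text.

```python
from fractions import Fraction

def firing_sequence(schedule):
    """Expand a looped schedule (list of actor names and (count, body) tuples)."""
    out = []
    for term in schedule:
        if isinstance(term, str):
            out.append(term)
        else:
            count, body = term
            out.extend(firing_sequence(body) * count)
    return out

def blocking_vector(schedule, component_q):
    """Map each maximal connected subgraph to J with inv(N, S) = J * q_C(N)."""
    seq = firing_sequence(schedule)
    result = {}
    for name, q in component_q.items():
        ratios = {Fraction(seq.count(n), q[n]) for n in q}
        if len(ratios) != 1:
            raise ValueError(f"firings in {name} are not a multiple of its repetitions vector")
        (j,) = ratios
        if j.denominator != 1 or j < 1:
            raise ValueError(f"no positive-integer blocking factor for {name}")
        result[name] = int(j)
    return result

# Figure 3 as described in the text: two components, each with q = (1, 1).
components = {"AB": {"A": 1, "B": 1}, "CD": {"C": 1, "D": 1}}
s = ["A", (2, ["C"]), "B", (2, ["D"])]      # A(2 C)B(2 D)
print(blocking_vector(s, components))        # {'AB': 1, 'CD': 2}
```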

It is often convenient to view parts of an SDF graph as subsystems that are invoked as single
units. The invocation of a subsystem corresponds to invoking a minimal valid schedule for the
associated subgraph. If this subgraph is connected, its repetitions vector gives the minimum number
of invocations required for a periodic schedule. However, if the subgraph is not connected,
then the minimum number of invocations involved in a periodic schedule is not necessarily
obtained by concatenating the repetitions vectors of the maximal connected subcomponents.

For example, consider the subsystem subgraph({A, B, D, E}) in the SDF graph of figure
4(a). It is easily verified that q(A, B, C, D, E) = (2, 2, 1, 4, 4). Thus, for a periodic schedule, the
actors in subgraph({D, E}) must execute twice as frequently as those in subgraph({A, B}). We
see that the minimal repetition rates for subgraph({A, B, D, E}) as a subgraph of the original
graph are given by ρ(A, B, D, E) = (1, 1, 2, 2), which can be obtained by dividing each corresponding
entry in q by gcd(q(A), q(B), q(D), q(E)) = gcd(2, 2, 4, 4) = 2 (see footnote 1). On the other hand, concatenating
the repetitions vectors of subgraph({A, B}) and subgraph({D, E}) yields the repetition rates ρ'(A,
B, D, E) = (1, 1, 1, 1). However, repeatedly invoking the subsystem with these relative rates can
never lead to a periodic schedule for the enclosing SDF graph. We have motivated the following
definition.

Definition 4: Let G be a connected SDF graph, suppose that Z is a subset of N(G), and let R =
subgraph(Z). We define q_G(Z) = gcd({q_G(N) | N ∈ Z}), and we define q_{R/G} to be the vector of positive
integers indexed by the members of Z that is defined by q_{R/G}(N) = q_G(N) / q_G(Z), ∀ N ∈ Z.
q_G(Z) can be viewed as the number of times a minimal periodic schedule for G invokes the subgraph
R, and we refer to q_{R/G} as the repetitions vector of R as a subgraph of G. For example, in
figure 4, if R = subgraph({A, B, D, E}), then q_G(N(R)) = 2, and q_{R/G}(A, B, D, E) = (1, 1, 2,
2).

1. gcd denotes the greatest common divisor.

Fact 5: If G is a connected SDF graph and R is a connected subgraph of G, then q_{R/G} = q_R. Thus
for a connected subgraph R, ∀ N ∈ N(R), q_G(N) = q_G(N(R)) q_R(N).

Proof. Let S be any valid schedule for G of unit blocking factor, and let S' = restriction(S,
R). Then from fact 4 and fact 2, for all N ∈ N(R), we have q_G(N) = J(S') q_R(N). But from fact 1,
we know that the components of q_R are coprime. It follows that J(S') = gcd({q_G(N') | N' ∈ N(R)}) =
q_G(N(R)). Thus, ∀ N ∈ N(R), q_R(N) = q_G(N) / q_G(N(R)) = q_{R/G}(N). QED.

For example, in figure 4(a), let R = subgraph({A, B}). We have q_G(A, B, C, D, E) = (2, 2,
1, 4, 4), q_R(A, B) = (1, 1), and from definition 4, q_G(N(R)) = gcd(2, 2) = 2, and q_{R/G} = (2, 2) / 2 =
(1, 1). As fact 5 assures us, q_R = q_{R/G}.
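
The quantities in Definition 4 reduce to a gcd and an elementwise division. The sketch below reproduces the two worked examples using the repetitions vector stated in the text for figure 4(a); the function name and dictionary layout are illustrative assumptions.

```python
from functools import reduce
from math import gcd

def subgraph_repetitions(q, z):
    """Return (q_G(Z), q_{R/G}) for the node subset z, per Definition 4."""
    g = reduce(gcd, (q[n] for n in z))
    return g, {n: q[n] // g for n in z}

q_g = {"A": 2, "B": 2, "C": 1, "D": 4, "E": 4}   # from the text, figure 4(a)
print(subgraph_repetitions(q_g, ["A", "B", "D", "E"]))  # (2, {'A': 1, 'B': 1, 'D': 2, 'E': 2})
print(subgraph_repetitions(q_g, ["A", "B"]))            # (2, {'A': 1, 'B': 1})
```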

Finally, we formalize the concept of clustering a subgraph of a connected SDF graph G,
which, as we discussed above, is used to organize hierarchy for scheduling purposes. This process
is illustrated in figure 4. Here subgraph({A, B, D, E}) of figure 4(a) is clustered into the hierarchical
node Q, and the resulting SDF graph is shown in figure 4(b). Each input arc α to a clustered
subgraph R is replaced by an arc α' having p(α') = p(α), and c(α') = c(α) × q_{R/G}(sink(α)), the number
of samples consumed from α in one invocation of R as a subgraph of G. Similarly we replace
each output arc β with β' such that c(β') = c(β), and p(β') = p(β) × q_{R/G}(source(β)). We will use the
following property of clustered subgraphs.
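
The arc-rewriting rule for clustering can be sketched as follows. Because figure 4 is not reproduced here, the graph topology and arc rates below are hypothetical: they were chosen only so that the repetitions vector equals the (2, 2, 1, 4, 4) stated in the text and so that {A, B, D, E} splits into the components {A, B} and {D, E}. The function name is likewise an assumption.

```python
def cluster(arcs, z, q_rel, name="Q"):
    """Replace the nodes in z by one hierarchical node, rescaling boundary arcs.

    arcs is a list of (source, sink, produced, consumed, delay);
    q_rel is the repetitions vector of the subgraph as a subgraph of G.
    """
    out = []
    for src, snk, p, c, d in arcs:
        if src in z and snk in z:
            continue                                   # internal arc is hidden
        if snk in z:                                   # input arc: scale consumption
            out.append((src, name, p, c * q_rel[snk], d))
        elif src in z:                                 # output arc: scale production
            out.append((name, snk, p * q_rel[src], c, d))
        else:
            out.append((src, snk, p, c, d))
    return out

# Hypothetical stand-in for figure 4(a), consistent with q = (2, 2, 1, 4, 4).
arcs_g = [("A", "B", 1, 1, 0), ("B", "C", 1, 2, 0),
          ("D", "E", 1, 1, 0), ("E", "C", 1, 4, 0)]
q_rel = {"A": 1, "B": 1, "D": 2, "E": 2}
print(cluster(arcs_g, {"A", "B", "D", "E"}, q_rel))
# [('Q', 'C', 1, 2, 0), ('Q', 'C', 2, 4, 0)]
```

In this stand-in graph both rewritten arcs balance with Q firing twice per firing of C, which agrees with q_G(Z) = 2 from the text.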

Fact 6: Suppose G is an SDF graph, R is a subgraph of G, G' is the SDF graph that results from
clustering R into the hierarchical node Q, S' is a valid schedule for G', and S_R is a valid schedule
for R such that ∀ N ∈ N(R), inv(N, S_R) = q_{R/G}(N). Let S* denote the schedule that results from
replacing each appearance of Q in S' with S_R. Then S* is a valid schedule for G.

As a simple example, consider figure 4 again. Now, (2 Q)C is a valid schedule for the SDF
graph in figure 4(b), and S = AB(2 DE) is a valid schedule for R = subgraph({A, B, D, E}) such
that inv(N, S) = q_{R/G}(N) ∀ N. Thus, fact 6 guarantees that (2 AB(2 DE))C is a valid schedule for
figure 4(a).
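
The substitution in fact 6 can be exercised on this example. The sketch below replaces Q by the subgraph schedule and then checks validity by simulation. The arc rates are a hypothetical stand-in for figure 4(a), which is not reproduced here, chosen to be consistent with q = (2, 2, 1, 4, 4); the function names are illustrative.

```python
def firing_sequence(schedule):
    """Expand a looped schedule (list of actor names and (count, body) tuples)."""
    out = []
    for term in schedule:
        if isinstance(term, str):
            out.append(term)
        else:
            out.extend(firing_sequence(term[1]) * term[0])
    return out

def substitute(schedule, node, replacement):
    """Replace every appearance of `node` by the terms of `replacement`."""
    out = []
    for term in schedule:
        if term == node:
            out.extend(replacement)
        elif isinstance(term, str):
            out.append(term)
        else:
            out.append((term[0], substitute(term[1], node, replacement)))
    return out

def is_valid(nodes, arcs, schedule):
    """Admissible (never underflows) and periodic (every node fires, tokens restored)."""
    tokens = [d for (_, _, _, _, d) in arcs]
    seq = firing_sequence(schedule)
    for node in seq:
        for k, (src, snk, p, c, _) in enumerate(arcs):
            if snk == node:
                if tokens[k] < c:
                    return False
                tokens[k] -= c
        for k, (src, snk, p, c, _) in enumerate(arcs):
            if src == node:
                tokens[k] += p
    return set(seq) == set(nodes) and tokens == [d for (_, _, _, _, d) in arcs]

# Hypothetical stand-in for figure 4(a); arcs are (source, sink, p, c, delay).
arcs_g = [("A", "B", 1, 1, 0), ("B", "C", 1, 2, 0),
          ("D", "E", 1, 1, 0), ("E", "C", 1, 4, 0)]
s_clustered = [(2, ["Q"]), "C"]                     # (2 Q)C
s_r = ["A", "B", (2, ["D", "E"])]                   # AB(2 DE)
s_star = substitute(s_clustered, "Q", s_r)
print(s_star)                                        # (2 AB(2 DE))C in nested form
print(is_valid("ABCDE", arcs_g, s_star))             # True
```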

Proof of fact 6. Given a schedule σ and an SDF arc α, we define

Δ(α, σ) = inv(source(α), σ) × p(α) − inv(sink(α), σ) × c(α).

Then σ is a periodic schedule iff it invokes each actor and Δ(α, σ) = 0 ∀ α.

We can decompose S' into s_1 Q s_2 Q ... Q s_k, where each s_j denotes the sequence of firings
between the (j − 1)th and jth invocations of Q. Then S* = s_1 S_R s_2 S_R ... S_R s_k.

First, suppose that θ is an arc in G such that source(θ), sink(θ) ∉ N(R). Then S_R contains
no occurrences of source(θ) nor sink(θ), so P(θ, i, S*) = P(θ, i, S') for any invocation number i of
sink(θ). Thus, since S' is admissible, S* does not terminate on θ. Also, Δ(θ, S*) = Δ(θ, s_1 s_2 ... s_k) =
Δ(θ, S') = 0, since S' is periodic.

If source(θ), sink(θ) ∈ N(R), then none of the s_j’s contain any occurrences of source(θ) or
sink(θ). Thus for any i, P(θ, i, S*) = P(θ, i, S**) and Δ(θ, S*) = Δ(θ, S**), where S** = S_R S_R ... S_R
denotes S* with all of the s_j’s removed. Since S** consists of successive invocations of a valid
schedule, it follows that S* does not terminate on θ, and Δ(θ, S*) = 0.

Now suppose that source(θ) ∈ N(R), and sink(θ) ∉ N(R). Then corresponding to θ, there is
an arc θ' in G', such that source(θ') = Q, sink(θ') = sink(θ), p(θ') = q_{R/G}(source(θ)) p(θ), and c(θ') =
c(θ). Now each invocation of S_R produces inv(source(θ), S_R) p(θ) = q_{R/G}(source(θ)) p(θ) = p(θ')
samples onto θ. Since c(θ') = c(θ) and S' is a valid schedule, it follows that Δ(θ, S*) = 0 and S* does
not terminate on θ.

Similarly, if source(θ) ∉ N(R), and sink(θ) ∈ N(R), we see that each invocation of S_R consumes
the same number of samples from θ as Q consumes from the corresponding arc in G', and
thus Δ(θ, S*) = 0 and S* does not terminate on θ.

We conclude that S* does not terminate on any arc in G, and thus S* is admissible. Furthermore,
Δ(α, S*) = 0 for all arcs α in G, and since S' and S_R are both periodic schedules, it is easily
verified that S* invokes each actor in G at least once, so we conclude that S* is a periodic schedule.
QED.

We  conclude  this  section  with  a  fact  that  relates  the  repetitions  vector  of  an  SDF  graph 
obtained  by  clustering  a  subgraph  to  the  repetitions  vector  of  the  original  graph. 

Fact 7: If G is a connected SDF graph, Z ⊆ N(G), and G' is the SDF graph obtained from G by
clustering subgraph(Z) into the node Q, then q_{G'}(Q) = q_G(Z), and ∀ N ∉ Z, q_{G'}(N) = q_G(N).

Proof. Let q' denote the vector that we claim is the repetitions vector for G', and recall from facts
1 and 2 that q' = q_{G'} iff q' satisfies the balance equations for G' and the components of q' are
coprime. It can easily be verified that q' satisfies the balance equations for G'. Furthermore, from
fact 1, no positive integer greater than one can divide all members of ({q_G(N) | N ∉ Z} ∪
{gcd({q_G(N) | N ∈ Z})}). Since q_G(Z) = gcd({q_G(N) | N ∈ Z}), it follows that the components of q'
are collectively coprime. QED.

4  Factoring  Schedule  Loops 

In  this  section,  we  show  that  in  a  single  appearance  schedule,  we  can  “factor”  common 
terms from the iteration counts of inner loops into the iteration count of the enclosing loop. An
important practical advantage of factoring is that it may significantly reduce the amount of memory
required for buffering.

For  example,  consider  the  SDF  graph  in  figure  5.  One  valid  single  appearance  schedule 
for  this  graph  is  (100  A)(100  B)(10  C)  D.  With  this  schedule,  prior  to  each  invocation  of  C,  100 
tokens  are  queued  on  each  of  C’s  input  arcs,  and  a  maximum  of  10  tokens  are  queued  on  D’s  input 
arc.  Thus  210  words  of  memory  are  required  to  implement  the  buffering  for  this  schedule. 

Fig. 5. An SDF graph used to illustrate the factoring of loops. For this graph, q(A, B, C, D) = (100,
100, 10, 1).

Now observe that this schedule induces the same firing sequence as (1 (100 A)(100 B)(10
C)) D. The result developed in this section allows us to factor the common divisor of 10 in the
iteration counts of the three inner loops into the iteration count of the outer loop. This yields the
new single appearance schedule (10 (10 A)(10 B)C)D, for which at most ten tokens simultaneously
reside on each arc. Thus this factoring application has reduced the buffering requirement
by a factor of 7 (from 210 words to 30).
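
The buffering figures quoted above can be checked with a small simulation. The sketch below is only
an illustration: the nested-list schedule representation and the function names are assumptions, and
because figure 5 is not reproduced here, the arc rates (A and B each produce 1 token per firing, C
consumes 10 from each input and produces 1, D consumes 10, no delays) are hypothetical values chosen
to be consistent with q(A, B, C, D) = (100, 100, 10, 1) and with the token counts stated in the text.

```python
def firings(schedule):
    """Expand a looped schedule into its flat firing sequence.

    A schedule is a list whose items are actor names (str) or loops (count, body).
    """
    seq = []
    for item in schedule:
        if isinstance(item, str):
            seq.append(item)
        else:
            count, body = item
            seq.extend(firings(body) * count)
    return seq


def max_tokens(schedule, arcs):
    """Peak token count per arc; arcs maps (src, snk) -> (produced, consumed, delay)."""
    tokens = {arc: delay for arc, (_, _, delay) in arcs.items()}
    peak = dict(tokens)
    for actor in firings(schedule):
        for arc, (_, consumed, _) in arcs.items():
            if arc[1] == actor:
                if tokens[arc] < consumed:
                    raise RuntimeError(f"schedule terminates on arc {arc}")
                tokens[arc] -= consumed
        for arc, (produced, _, _) in arcs.items():
            if arc[0] == actor:
                tokens[arc] += produced
                peak[arc] = max(peak[arc], tokens[arc])
    return peak


# Hypothetical rates for the graph of figure 5 (see the assumptions above).
arcs = {("A", "C"): (1, 10, 0), ("B", "C"): (1, 10, 0), ("C", "D"): (1, 10, 0)}
flat = [(100, ["A"]), (100, ["B"]), (10, ["C"]), "D"]
factored = [(10, [(10, ["A"]), (10, ["B"]), "C"]), "D"]

print(sum(max_tokens(flat, arcs).values()))      # total buffer words, unfactored
print(sum(max_tokens(factored, arcs).values()))  # total buffer words, factored
```

Under these assumed rates the unfactored schedule needs 100 + 100 + 10 words and the factored one
needs 10 on each of the three arcs.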

There is, however, a trade-off involved in factoring. For example, the schedule (100
A)(100 B)(10 C)D requires 3 loop initiations per schedule period, while the factored schedule (10
(10 A)(10 B)C)D requires 21. Thus the runtime cost of starting loops (usually, initializing the
loop indices) has increased by the same factor by which the buffering cost has decreased. However,
the loop-startup overhead is normally much smaller than the penalty that is paid when the
memory requirement exceeds the on-chip limits. Unfortunately, we cannot in general perform the
reverse of the factoring transformation, i.e. moving a factor of the outer loop's iteration count
into the inner loops. This reverse transformation would be desirable in situations where minimizing
buffering requirements is not critical.
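
The loop-initiation counts of 3 and 21 can be reproduced with a short recursive count. This is a
sketch under an assumed representation (a schedule is a list of actor names and (count, body) loops);
the function name loop_initiations is hypothetical. Each loop is started once per execution of its
enclosing body, and its inner loops are started once per iteration.

```python
def loop_initiations(schedule):
    """Number of loop start-ups incurred by one execution of the schedule body."""
    total = 0
    for item in schedule:
        if not isinstance(item, str):
            count, body = item
            # One start-up for this loop, plus the start-ups of its inner
            # loops, which are repeated on every iteration.
            total += 1 + count * loop_initiations(body)
    return total


flat = [(100, ["A"]), (100, ["B"]), (10, ["C"]), "D"]
factored = [(10, [(10, ["A"]), (10, ["B"]), "C"]), "D"]

print(loop_initiations(flat), loop_initiations(factored))
```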

In  this  section,  we  prove  the  validity  of  factoring  for  an  arbitrary  “factorable”  loop  in  a 
single  appearance  schedule. 

Definition 5: Given a schedule S0, we denote the set of actors that appear in S0 by actors(S0). For
example, actors((2 (2 B)(5 A))) = {A, B} and actors((3 X(2 Y(3 Z)X))) = {X, Y, Z}.

Lemma 1: Suppose that S is a single appearance schedule (that is not necessarily a valid schedule)
for the SDF graph G, and S0 is a subschedule of S such that S0 is a valid schedule for
subgraph(actors(S0), G). Then S does not terminate on any arc $\theta$ for which
$\mathrm{source}(\theta), \mathrm{sink}(\theta) \in \mathrm{actors}(S_0)$.

For example, suppose that S is the schedule D(2 A(2 BC))E for the SDF graph in figure 6,
and S0 is the subschedule (2 A(2 BC)). Lemma 1 guarantees that S does not terminate on any arc
that is contained in subgraph({A, B, C}). No matter what the values of the delays $\{d_i\}$ are, S does
not terminate on the arc from A to B, nor the arc from A to C.

Proof of lemma 1: Since S is a single appearance schedule, $\mathrm{source}(\theta)$ and
$\mathrm{sink}(\theta)$ are invoked only through invocations of S0. Since S0 is admissible, the number
of samples on $\theta$ prior to each invocation of $\mathrm{sink}(\theta)$ is at least $c(\theta)$.
Thus S does not terminate on $\theta$. QED.

Lemma 2: Suppose that G is an SDF graph, S is an admissible looped schedule for G, and S0 is a
subschedule of S. Suppose also that S0' is any looped schedule such that actors(S0') = actors(S0),
and $\mathrm{inv}(N, S_0) = \mathrm{inv}(N, S_0')\ \forall N \in \mathrm{actors}(S_0)$. Let S' denote
the schedule obtained by replacing S0 with S0' in S. Then S' does not terminate on any arc $\theta$
that is not contained in subgraph(actors(S0), G); equivalently,
$(\mathrm{source}(\theta) \notin \mathrm{actors}(S_0)$ or $\mathrm{sink}(\theta) \notin \mathrm{actors}(S_0))
\Rightarrow$ S' does not terminate on $\theta$.

Again consider the example in figure 6 and suppose that D(2 A(2 BC))E is an admissible
schedule for this SDF graph. Then lemma 2 (with S0 = A(2 BC), and S0' = BCABC) tells us that
D(2 BCABC)E does not terminate on any of the four arcs that lie outside of subgraph({A, B, C}).

Before  moving  to  the  proof,  we  emphasize  that  lemma  2  applies  to  general  looped  sched¬ 
ules,  not  just  single  appearance  schedules. 

Proof of lemma 2: Let $\theta$ be any arc that is not contained in subgraph(actors(S0), G). Let i
be any invocation number of $\mathrm{sink}(\theta)$; that is,
$1 \le i \le \mathrm{inv}(\mathrm{sink}(\theta), S')$. The sequence of invocations
fired in one period of S can be decomposed into $(s_1\, b_1\, s_2\, b_2 \ldots b_n\, s_{n+1})$, where
$b_j$ denotes the sequence of firings associated with the jth invocation of subschedule S0, and $s_j$
is the sequence of firings between the (j - 1)th and jth invocations of S0. Since S' is derived by
rearranging the firings in S0, we can express it similarly as
$(s_1\, b_1'\, s_2\, b_2' \ldots b_n'\, s_{n+1})$, where $b_j'$ corresponds to the jth invocation
of S0' in S'.

If neither $\mathrm{source}(\theta)$ nor $\mathrm{sink}(\theta)$ is contained in actors(S0), then none
of the $b_j$'s nor any of the $b_j'$'s contain any occurrences of $\mathrm{sink}(\theta)$ or
$\mathrm{source}(\theta)$. Thus
$P(\theta, i, S) = P(\theta, i, s_1 s_2 \ldots s_{n+1}) = P(\theta, i, S')$.

Now suppose $\mathrm{source}(\theta) \in \mathrm{actors}(S_0)$ and
$\mathrm{sink}(\theta) \notin \mathrm{actors}(S_0)$. Let k denote the number of invocations of S0 that
precede $\mathrm{sink}(\theta)_i$ in S. Then, since
$\mathrm{inv}(\mathrm{sink}(\theta), b_j) = \mathrm{inv}(\mathrm{sink}(\theta), b_j') = 0\ \forall j$,
we have that k invocations of S0' precede $\mathrm{sink}(\theta)_i$ in S'. It follows that
$P(\theta, i, S) = P(\theta, i, s_1 s_2 \ldots s_{n+1}) + k \times \mathrm{inv}(\mathrm{source}(\theta), S_0)$,
and
$P(\theta, i, S') = P(\theta, i, s_1 s_2 \ldots s_{n+1}) + k \times \mathrm{inv}(\mathrm{source}(\theta), S_0')$.
But, by assumption, $\mathrm{inv}(\mathrm{source}(\theta), S_0) = \mathrm{inv}(\mathrm{source}(\theta), S_0')$,
so $P(\theta, i, S) = P(\theta, i, S')$.

Fig. 6. An example used to illustrate the application of lemmas 1 and 2. Each $d_i$ represents the
number of delays on the corresponding arc. Here q(A, B, C, D, E) = (2, 4, 4, 1, 1).

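Finally, suppose $\mathrm{source}(\theta) \notin \mathrm{actors}(S_0)$ and
$\mathrm{sink}(\theta) \in \mathrm{actors}(S_0)$. There are two sub-cases to consider here:

(1) In S, $\mathrm{sink}(\theta)_i$ occurs in one of the $s_j$'s, say $s_k$. Since
$\mathrm{inv}(\mathrm{sink}(\theta), S_0) = \mathrm{inv}(\mathrm{sink}(\theta), S_0')$, it follows that
in S', $\mathrm{sink}(\theta)_i$ occurs in $s_k$ as well. Since
$\mathrm{source}(\theta) \notin \mathrm{actors}(S_0)$, we have
$P(\theta, i, S) = P(\theta,\ i - (k - 1)\,\mathrm{inv}(\mathrm{sink}(\theta), S_0),\ s_1 s_2 \ldots s_k)
= P(\theta,\ i - (k - 1)\,\mathrm{inv}(\mathrm{sink}(\theta), S_0'),\ s_1 s_2 \ldots s_k) = P(\theta, i, S')$.

(2) In S, $\mathrm{sink}(\theta)_i$ occurs in one of the $b_j$'s, say $b_m$. Then
$\mathrm{inv}(\mathrm{sink}(\theta), S_0) = \mathrm{inv}(\mathrm{sink}(\theta), S_0')$ implies that in
S', $\mathrm{sink}(\theta)_i$ occurs in $b_m'$. Since
$\mathrm{source}(\theta) \notin \mathrm{actors}(S_0)$,
$P(\theta, i, S) = \mathrm{inv}(\mathrm{source}(\theta), s_1 s_2 \ldots s_m) = P(\theta, i, S')$.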

Thus, for arbitrary i, $P(\theta, i, S) = P(\theta, i, S')$. From the admissibility of S, it follows
that S' does not terminate on $\theta$. QED.

The  following  theorem  establishes  the  validity  of  our  factoring  transformation. 

Theorem 1: Suppose that S is a valid single appearance schedule for G and suppose that
$L = (m\ (n_1\, S_1)\ (n_2\, S_2) \ldots (n_k\, S_k))$ is a schedule loop within S of any nesting depth.
Suppose also that $\gamma$ is any positive integer that divides $n_1, n_2, \ldots, n_k$, and let L'
denote the loop
$(\gamma m\ (\gamma^{-1} n_1\, S_1)\ (\gamma^{-1} n_2\, S_2) \ldots (\gamma^{-1} n_k\, S_k))$.
Then replacing L with L' in S results in a valid schedule for G.

Proof:  We  will  use  the  following  definition  in  our  proof  of  this  theorem. 

Definition 6: Given a schedule loop L in S and an arc $\theta$ in G, we define
$\mathrm{consumed}(\theta, L)$ to be the number of samples consumed from $\theta$ by
$\mathrm{sink}(\theta)$ during one invocation of L. Similarly, we define
$\mathrm{produced}(\theta, L)$ to be the number of samples produced onto $\theta$ during one
invocation of L. Clearly, if the number of samples on $\theta$ is at least
$\mathrm{consumed}(\theta, L)$ just prior to a particular invocation of L,
then S will not terminate on $\theta$ during that invocation of L.

Let S' denote the schedule that results from replacing L with L' in S. By construction of S',
we have that for all actors A in G, inv(A, S') = inv(A, S). Since S is valid, and hence periodic, it
follows that S' is a periodic schedule. It remains to be shown that S' is admissible. We will
demonstrate this by induction on k.

First, observe that for k = 1, L and L' generate the same firing sequence, and thus S and S'
generate the same firing sequence. We conclude that S' is admissible for k = 1.

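Now consider the case k = 2. Then $L = (m\ (n_1\, S_1)(n_2\, S_2))$ and
$L' = (\gamma m\ (\gamma^{-1} n_1\, S_1)(\gamma^{-1} n_2\, S_2))$.
By construction, J(S') = J(S) and S' is also a single appearance schedule. Now let $\theta$ be an arc
in G. If $\mathrm{source}(\theta) \in \mathrm{actors}(S_1)$ and
$\mathrm{sink}(\theta) \in \mathrm{actors}(S_2)$ then

$$
\begin{aligned}
\mathrm{produced}(\theta, (\gamma^{-1} n_1\, S_1))
  &= J(S)\, q_G(\mathrm{source}(\theta))\, p(\theta) \,/\, \mathrm{inv}((\gamma^{-1} n_1\, S_1), S') \\
  &= J(S)\, q_G(\mathrm{source}(\theta))\, p(\theta) \,/\, (\gamma m \times \mathrm{inv}(L', S')) \\
  &= J(S)\, q_G(\mathrm{sink}(\theta))\, c(\theta) \,/\, (\gamma m \times \mathrm{inv}(L', S'))
     \quad \text{(by fact 2)} \\
  &= \mathrm{consumed}(\theta, (\gamma^{-1} n_2\, S_2)).
\end{aligned}
$$

Similarly, if $\mathrm{source}(\theta) \in \mathrm{actors}(S_2)$ and
$\mathrm{sink}(\theta) \in \mathrm{actors}(S_1)$, then
$\mathrm{produced}(\theta, (\gamma^{-1} n_2\, S_2)) = \mathrm{consumed}(\theta, (\gamma^{-1} n_1\, S_1))$.
Summarizing, we have

$$
\mathrm{source}(\theta) \in \mathrm{actors}(S_1),\ \mathrm{sink}(\theta) \in \mathrm{actors}(S_2)
\ \Rightarrow\
\mathrm{produced}(\theta, (\gamma^{-1} n_1\, S_1)) = \mathrm{consumed}(\theta, (\gamma^{-1} n_2\, S_2));
$$

and

$$
\mathrm{source}(\theta) \in \mathrm{actors}(S_2),\ \mathrm{sink}(\theta) \in \mathrm{actors}(S_1)
\ \Rightarrow\
\mathrm{produced}(\theta, (\gamma^{-1} n_2\, S_2)) = \mathrm{consumed}(\theta, (\gamma^{-1} n_1\, S_1)).
\qquad \text{(EQ 1)}
$$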

Now we will show that S' does not terminate on $\theta$ for an arbitrary arc $\theta$ in G.

Case 1: $\mathrm{source}(\theta) \in \mathrm{actors}(S_1)$, $\mathrm{sink}(\theta) \in \mathrm{actors}(S_2)$.
From EQ 1, we know that prior to each invocation of $(\gamma^{-1} n_2\, S_2)$, at least
$\mathrm{consumed}(\theta, (\gamma^{-1} n_2\, S_2))$ samples reside on $\theta$. Thus S' never
terminates on $\theta$ during an invocation of $(\gamma^{-1} n_2\, S_2)$. Furthermore, since S' is a
single appearance schedule, $\mathrm{sink}(\theta)$ is fired only through invocations of
$(\gamma^{-1} n_2\, S_2)$, and it follows that S' does not terminate on $\theta$.

Case 2: $\mathrm{source}(\theta) \in \mathrm{actors}(S_2)$ and $\mathrm{sink}(\theta) \in \mathrm{actors}(S_1)$.
Since S is an admissible schedule, $\mathrm{delay}(\theta) \ge \mathrm{consumed}(\theta, (n_1\, S_1))$;
otherwise S would terminate on $\theta$ during the first invocation of $(n_1\, S_1)$. Since
$\gamma \ge 1$, it follows that
$\mathrm{delay}(\theta) \ge \mathrm{consumed}(\theta, (\gamma^{-1} n_1\, S_1))$, so S' does not
terminate on $\theta$ during the first invocation of $(\gamma^{-1} n_1\, S_1)$. From EQ 1, we know
that prior to each subsequent invocation of $(\gamma^{-1} n_1\, S_1)$, at least
$\mathrm{consumed}(\theta, (\gamma^{-1} n_1\, S_1))$ samples reside on $\theta$, so S' does not
terminate on $\theta$ for invocations 2, 3, 4, ... of $(\gamma^{-1} n_1\, S_1)$. We conclude that S'
does not terminate on $\theta$.

Case 3: $\mathrm{source}(\theta), \mathrm{sink}(\theta) \in \mathrm{actors}(S_1)$. Since S is a valid
single appearance schedule, $S_1$ must be a valid schedule for subgraph(actors($S_1$)). Applying
lemma 1 with S0 = $S_1$, we see that S' does not terminate on $\theta$.

Case 4: $\mathrm{source}(\theta), \mathrm{sink}(\theta) \in \mathrm{actors}(S_2)$. From lemma 1 with
S0 = $S_2$, S' does not terminate on $\theta$.

Case 5: $\mathrm{source}(\theta) \notin (\mathrm{actors}(S_1) \cup \mathrm{actors}(S_2))$, or
$\mathrm{sink}(\theta) \notin (\mathrm{actors}(S_1) \cup \mathrm{actors}(S_2))$.
Applying lemma 2 with S0 = L and S0' = L', we see that S' does not terminate on $\theta$.

From our conclusions in cases 1-5, S' does not terminate on any arc in G, and it follows
that S' is an admissible schedule. Thus theorem 1 holds for k = 2.

Now suppose that theorem 1 holds whenever $k \le k'$, for some $k' \ge 2$. We will show that this
implies the validity of theorem 1 for $k \le k' + 1$. For $k = k' + 1$,
$L = (m\ (n_1\, S_1)\ (n_2\, S_2) \ldots (n_{k'+1}\, S_{k'+1}))$ and
$L' = (\gamma m\ (\gamma^{-1} n_1\, S_1)\ (\gamma^{-1} n_2\, S_2) \ldots (\gamma^{-1} n_{k'+1}\, S_{k'+1}))$.
Let $S_a$ denote the schedule that results from replacing L with the loop
$L_a = (m\ (1\ (n_1\, S_1)\ (n_2\, S_2) \ldots (n_{k'}\, S_{k'}))\ (n_{k'+1}\, S_{k'+1}))$.
Since $L_a$ and L induce the same firing sequence, $S_a$ induces the same firing sequence as S. Now
theorem 1 for $k = k'$ guarantees that replacing
$(1\ (n_1\, S_1)\ (n_2\, S_2) \ldots (n_{k'}\, S_{k'}))$ with
$(\gamma\ (\gamma^{-1} n_1\, S_1)\ (\gamma^{-1} n_2\, S_2) \ldots (\gamma^{-1} n_{k'}\, S_{k'}))$ in
$S_a$ results in a valid schedule $S_b$.

Observe that $S_b$ is the schedule S with L replaced by
$L_b = (m\ (\gamma\ (\gamma^{-1} n_1\, S_1)\ (\gamma^{-1} n_2\, S_2) \ldots (\gamma^{-1} n_{k'}\, S_{k'}))\ (n_{k'+1}\, S_{k'+1}))$.
Theorem 1 for k = 2 guarantees that replacing $L_b$ with
$L_c = (\gamma m\ (1\ (\gamma^{-1} n_1\, S_1)\ (\gamma^{-1} n_2\, S_2) \ldots (\gamma^{-1} n_{k'}\, S_{k'}))\ (\gamma^{-1} n_{k'+1}\, S_{k'+1}))$
yields another valid schedule $S_c$. Now $L_c$ yields the same firing sequence as
$L' = (\gamma m\ (\gamma^{-1} n_1\, S_1)\ (\gamma^{-1} n_2\, S_2) \ldots (\gamma^{-1} n_{k'+1}\, S_{k'+1}))$,
so replacing $L_c$ with L' in $S_c$ yields an admissible schedule $S_d$. But, by our construction,
$S_d$ = S', so S' is a valid schedule for G.

We have shown that theorem 1 holds for k = 1 and k = 2, and we have shown that if the
result holds for $k \le k'$, then it holds for $k \le (k' + 1)$. We conclude that theorem 1 holds for
all k.
QED. 
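
The factoring transformation of theorem 1 is mechanical once a loop is in the form the theorem
requires. The Python sketch below is a hedged illustration: the (count, body) tuple representation and
the names factor_loop and firings are assumptions for this example, and the function checks only the
syntactic preconditions (all iterands are loops and gamma divides every inner count). It does not
check that the enclosing schedule is a valid single appearance schedule, which the theorem also
requires.

```python
from collections import Counter


def firings(schedule):
    """Expand a looped schedule (list of actor names and (count, body) loops)."""
    seq = []
    for item in schedule:
        if isinstance(item, str):
            seq.append(item)
        else:
            count, body = item
            seq.extend(firings(body) * count)
    return seq


def factor_loop(loop, gamma):
    """Turn (m (n1 S1) ... (nk Sk)) into (gamma*m (n1/gamma S1) ... (nk/gamma Sk))."""
    m, body = loop
    if gamma < 1 or any(isinstance(it, str) or it[0] % gamma for it in body):
        raise ValueError("gamma must divide the iteration count of every inner loop")
    return (gamma * m, [(n // gamma, s) for n, s in body])


# The loop (1 (100 A)(100 B)(10 C)) from the figure 5 discussion, factored by 10.
loop = (1, [(100, ["A"]), (100, ["B"]), (10, ["C"])])
factored = factor_loop(loop, 10)
print(factored)
```

The result is (10 (10 A)(10 B)(1 C)), which the paper writes as (10 (10 A)(10 B)C). The number of
invocations of each actor is unchanged, as used at the start of the proof.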

Note that factoring is not necessarily a legitimate transformation in admissible schedules
that are not single appearance schedules. As a counter-example, consider the SDF graph in figure
7. One can easily verify that the looped schedule A(2 B)(2 CCBB)CC is a valid schedule for this
SDF graph. If we factor the two adjacent schedule loops, which have a common iteration count,
we obtain the looped schedule A(2 BCCBB)CC. The firing sequence generated by this latter
looped schedule, ABCCBBBCCBBCC, terminates on the input arc of C at invocation C2. Thus,
factoring has converted a valid schedule into a schedule that is not valid.

Fig. 7. An example used to illustrate that factoring is not necessarily valid for schedules that are not
single appearance schedules. If the two adjacent schedule loops in the valid schedule A(2 B)(2
CCBB)CC are factored, the resulting schedule, A(2 BCCBB)CC, terminates on the input arc of C
at invocation C2.
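
This counter-example can be replayed with a token simulation. The sketch below is illustrative only.
Figure 7 is not reproduced here, so the graph is an assumption: a chain in which A produces 6 tokens
per firing for B, B consumes 1 and produces 1, and C consumes 1, with no delays. These hypothetical
rates are consistent with the invocation counts of the schedules in the text (A once, B and C six
times each). The function names are also assumptions.

```python
def firings(schedule):
    """Expand a looped schedule (list of actor names and (count, body) loops)."""
    seq = []
    for item in schedule:
        if isinstance(item, str):
            seq.append(item)
        else:
            count, body = item
            seq.extend(firings(body) * count)
    return seq


def first_termination(schedule, arcs):
    """Return (actor, invocation number, arc) where the schedule terminates, else None.

    arcs maps (src, snk) -> (produced, consumed, delay).
    """
    tokens = {arc: delay for arc, (_, _, delay) in arcs.items()}
    fired = {}
    for actor in firings(schedule):
        fired[actor] = fired.get(actor, 0) + 1
        inputs = [arc for arc in arcs if arc[1] == actor]
        for arc in inputs:
            if tokens[arc] < arcs[arc][1]:
                return actor, fired[actor], arc
        for arc in inputs:
            tokens[arc] -= arcs[arc][1]
        for arc in arcs:
            if arc[0] == actor:
                tokens[arc] += arcs[arc][0]
    return None


# Hypothetical stand-in for the graph of figure 7 (see the assumptions above).
arcs = {("A", "B"): (6, 1, 0), ("B", "C"): (1, 1, 0)}
valid = ["A", (2, ["B"]), (2, ["C", "C", "B", "B"]), "C", "C"]
factored = ["A", (2, ["B", "C", "C", "B", "B"]), "C", "C"]

print(first_termination(valid, arcs))
print(first_termination(factored, arcs))
```

On this assumed graph the original schedule runs to completion, while the factored schedule fails at
the second invocation of C, on C's input arc.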

5  Reduced  Single  Appearance  Schedules 

We  begin  this  section  with  a  definition. 

Definition 7: Suppose that $\Lambda$ is either a schedule loop or a looped schedule. We say that
$\Lambda$ is non-coprime if all iterands of $\Lambda$ are schedule loops, and there exists an integer
j > 1 that divides all of the iteration counts of the iterands of $\Lambda$. If $\Lambda$ is not
non-coprime, we say that $\Lambda$ is coprime.

For  example,  the  schedule  loops  (3  (4  A)(2  B))  and  (10  (7  C))  are  both  non-coprime,  while 
the  loops  (5  (3  A)(7  B))  and  (70  C)  are  coprime.  Similarly,  the  looped  schedules  (4  AB)  and  (6 
AB)(3  C)  are  both  non-coprime,  while  the  schedules  A(7  B)(7  C)  and  (2  A)(3  B)  are  coprime. 
From  our  discussion  in  the  previous  section,  we  know  that  non-coprime  schedule  loops  may  result 
in  much  higher  buffering  requirements  than  their  factored  counterparts. 
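
Definition 7 translates directly into a short test. In the sketch below, the iterands of a schedule
loop or looped schedule are given as a list whose items are actor names or (count, body) loops; this
representation and the function name is_non_coprime are assumptions made for illustration. The
examples are the ones listed above.

```python
from functools import reduce
from math import gcd


def is_non_coprime(iterands):
    """True iff every iterand is a loop and some j > 1 divides all iteration counts."""
    if not iterands or any(isinstance(it, str) for it in iterands):
        return False  # an actor appearing as an iterand makes the construct coprime
    return reduce(gcd, (count for count, _ in iterands)) > 1


# Iterands of the schedule loops (3 (4 A)(2 B)), (10 (7 C)), (5 (3 A)(7 B)) and (70 C).
loop_examples = [
    [(4, ["A"]), (2, ["B"])],
    [(7, ["C"])],
    [(3, ["A"]), (7, ["B"])],
    ["C"],
]
# The looped schedules (4 AB), (6 AB)(3 C), A(7 B)(7 C) and (2 A)(3 B).
schedule_examples = [
    [(4, ["A", "B"])],
    [(6, ["A", "B"]), (3, ["C"])],
    ["A", (7, ["B"]), (7, ["C"])],
    [(2, ["A"]), (3, ["B"])],
]

print([is_non_coprime(x) for x in loop_examples])
print([is_non_coprime(x) for x in schedule_examples])
```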

Definition 8: Given a single appearance schedule S, we say that S is fully reduced if S is
coprime, and every schedule loop contained in S is coprime.

In this section, we show that we can always convert a valid single appearance schedule
that is not fully reduced into a valid fully reduced schedule. Thus, we can always avoid the
overhead associated with using non-coprime schedule loops over their corresponding factored forms.
To prove this, we use another useful fact: that any fully reduced schedule has blocking factor 1.
This implies that any schedule that has blocking factor greater than one is not fully reduced. Thus,
if we decide to generate a schedule that has nonunity blocking factor, then we risk introducing
higher buffering requirements.

Theorem  2:  Suppose  that  S  is  a  valid  single  appearance  schedule  for  a  connected  SDF  graph  G. 
If  S  is  fully  reduced  then  S  has  blocking  factor  1. 

Proof: First, suppose that not all iterands of S are schedule loops. Then some actor N appears as
an iterand. Since N is not enclosed by a loop in S, and since S is a single appearance schedule,
inv(N, S) = 1, and thus J(S) = 1.

Now suppose that all iterands of S are schedule loops, and suppose that j is an arbitrary
integer that is greater than one. Then since S is fully reduced, j does not divide at least one of the
iteration counts associated with the iterands of S. Define $i_0 = 1$ and let $L_1$ denote one of the
iterands of S whose iteration count $i_1$ is not divisible by $j = j / \gcd(j, i_0)$. Again, since S is
fully reduced, if all iterands of $L_1$ are schedule loops then there exists an iterand $L_2$ of $L_1$
such that $j / \gcd(j, i_0 i_1)$ does not divide the iteration count $i_2$ of $L_2$. Similarly, if all
iterands of $L_2$ are schedule loops, there exists an iterand $L_3$ of $L_2$ whose iteration count
$i_3$ is not divisible by $j / \gcd(j, i_0 i_1 i_2)$.

Continuing in this manner, we generate a sequence $L_1, L_2, L_3, \ldots$ such that the iteration
count $i_k$ of each $L_k$ is not divisible by $j / \gcd(j, i_0 i_1 \cdots i_{k-1})$. Since G is of
finite size, we cannot continue this process indefinitely: for some $m \ge 1$, not all iterands of
$L_m$ are schedule loops. Thus, there exists an actor N that is an iterand of $L_m$. Since S is a
single appearance schedule,

$$
\mathrm{inv}(N, S) = \mathrm{inv}(L_1, S)\, \mathrm{inv}(L_2, L_1)\, \mathrm{inv}(L_3, L_2) \cdots
\mathrm{inv}(L_m, L_{m-1})\, \mathrm{inv}(N, L_m) = i_0 i_1 i_2 \cdots i_m.
$$

By our selection of the $L_k$'s, $j / \gcd(j, i_0 i_1 i_2 \cdots i_{m-1})$ does not divide $i_m$, and
thus j does not divide inv(N, S).

We have shown that given any integer j > 1, $\exists N \in N(G)$ such that inv(N, S) is not divisible
by j. It follows that S has blocking factor 1. QED.
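
The argument above works with the invocation counts inv(N, S). For a valid schedule of a connected
graph these counts are J(S) times the (coprime) repetitions vector, so the blocking factor can be read
off as their greatest common divisor. The sketch below illustrates this under that assumption; the
schedule representation (lists of actor names and (count, body) loops) and the function names are
hypothetical.

```python
from collections import Counter
from functools import reduce
from math import gcd


def invocations(schedule):
    """inv(N, S) for every actor N, computed without expanding the loops."""
    counts = Counter()
    for item in schedule:
        if isinstance(item, str):
            counts[item] += 1
        else:
            n, body = item
            for actor, k in invocations(body).items():
                counts[actor] += n * k
    return counts


def blocking_factor(schedule):
    """gcd of the invocation counts (assumes a valid schedule of a connected graph)."""
    return reduce(gcd, invocations(schedule).values())


reduced = [(2, [(2, ["B"]), (5, ["A"])]), (5, ["C"])]  # (2 (2 B)(5 A))(5 C)
not_reduced = [(6, ["A", "B"]), (3, ["C"])]            # (6 AB)(3 C), non-coprime

print(invocations(reduced), blocking_factor(reduced))
print(blocking_factor(not_reduced))
```

The fully reduced schedule has blocking factor 1, in line with theorem 2, while the non-coprime
schedule (6 AB)(3 C) has blocking factor 3.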

Theorem 3: If an SDF graph G has a valid single appearance schedule, then G has a valid fully
reduced schedule.

Proof. We prove theorem 3 by construction. This construction process can easily be automated to
yield an efficient algorithm for synthesizing a valid fully reduced schedule from an arbitrary valid
single appearance schedule.

Given a looped schedule $\Psi$, we define non-coprime($\Psi$) to be the set of schedule loops in
$\Psi$ that are non-coprime. Now suppose that S is a valid single appearance schedule for G, and let
$\Lambda_1 = (m\ (n_1\, \Psi_1)\ (n_2\, \Psi_2) \ldots (n_k\, \Psi_k))$ be any innermost member of
non-coprime(S), i.e. $\Lambda_1$ is non-coprime, but every loop nested within $\Lambda_1$ is coprime.
From theorem 1, replacing $\Lambda_1$ with
$\Lambda_1' = (\gamma m\ (\gamma^{-1} n_1\, \Psi_1)\ (\gamma^{-1} n_2\, \Psi_2) \ldots (\gamma^{-1} n_k\, \Psi_k))$,
where $\gamma = \gcd(\{n_1, n_2, \ldots, n_k\})$, yields another valid single appearance schedule
$S_1$. Furthermore, $\Lambda_1'$ is coprime, and since every loop nested within $\Lambda_1$ is
coprime, every loop nested within $\Lambda_1'$ is coprime as well. Now let $\Lambda_2$ be any
innermost member of non-coprime($S_1$), and observe that $\Lambda_2$ cannot equal $\Lambda_1'$.
Theorem 1 guarantees a replacement $\Lambda_2'$ for $\Lambda_2$ that leads to another valid single
appearance schedule $S_2$. If we continue this process, it is clear that no replacement loop
$\Lambda_k'$ ever replaces one of the previous replacement loops
$\Lambda_1', \Lambda_2', \ldots, \Lambda_{k-1}'$, since these loops, and the loops nested within
these loops, are already coprime. Also, no replacement changes the total number of loops in the
schedule. It follows that we can continue this process only a finite number of times: eventually, we
will arrive at an $S_n$ such that non-coprime($S_n$) is empty.

Now if $S_n$ is a coprime looped schedule, we are done. Otherwise, $S_n$ is of the form
$(p_1\, Z_1)\ (p_2\, Z_2) \ldots (p_m\, Z_m)$, where $\gamma' = \gcd(\{p_1, p_2, \ldots, p_m\}) > 1$.
Applying theorem 1 to the schedule
$(1\ S_n) = (1\ (p_1\, Z_1)\ (p_2\, Z_2) \ldots (p_m\, Z_m))$, we have that
$(\gamma'\ ((\gamma')^{-1} p_1\, Z_1)\ ((\gamma')^{-1} p_2\, Z_2) \ldots ((\gamma')^{-1} p_m\, Z_m))$
is a valid schedule for G. From the definition of a valid schedule, it follows that
$S_n' = ((\gamma')^{-1} p_1\, Z_1)\ ((\gamma')^{-1} p_2\, Z_2) \ldots ((\gamma')^{-1} p_m\, Z_m)$
is also a valid schedule, and by our construction of $S_n$ and $S_n'$, $S_n'$ is coprime,
and all schedule loops in $S_n'$ are coprime. Thus $S_n'$ is a valid fully reduced schedule for G. QED.
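
One way the construction in this proof might be automated is sketched below. This is an
interpretation, not the authors' implementation: the names and the (count, body) representation are
assumptions, and the code processes loops bottom-up, which factors each innermost non-coprime loop
before the loops that enclose it. It assumes its input is a valid single appearance schedule and does
not verify this.

```python
from functools import reduce
from math import gcd


def common_factor(iterands):
    """gcd of the iteration counts if all iterands are loops, otherwise 1."""
    if not iterands or any(isinstance(it, str) for it in iterands):
        return 1
    return reduce(gcd, (n for n, _ in iterands))


def reduce_loop(loop):
    """Make a loop and everything nested inside it coprime by factoring (theorem 1)."""
    m, body = loop
    body = [it if isinstance(it, str) else reduce_loop(it) for it in body]
    g = common_factor(body)
    if g > 1:
        body = [(n // g, s) for n, s in body]
    return (g * m, body)


def fully_reduce(schedule):
    """Return a fully reduced version of a looped schedule (list of iterands)."""
    items = [it if isinstance(it, str) else reduce_loop(it) for it in schedule]
    g = common_factor(items)
    if g > 1:
        # Final step of the proof: drop the common top-level factor.
        items = [(n // g, s) for n, s in items]
    return items


print(fully_reduce([(1, [(100, ["A"]), (100, ["B"]), (10, ["C"])]), "D"]))
print(fully_reduce([(6, ["A", "B"]), (3, ["C"])]))
```

The first call reproduces the factored schedule (10 (10 A)(10 B)C)D from section 4. The second shows
the top-level step, which divides the blocking factor by the common factor 3.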

6  Constructing  Single  Appearance  Schedules 

Since valid single appearance schedules implement the full repetition inherent in an SDF
graph without requiring subroutines or code duplication, we examine the topological conditions
required for such a schedule to exist. First suppose that G is an acyclic SDF graph containing n
nodes. Then we can take some root node $r_1$ of G and fire all $q_G(r_1)$ invocations of $r_1$ in
succession. After all invocations of $r_1$ have fired, we can remove $r_1$ from G, pick a root node
$r_2$ of the new acyclic graph, and schedule its $q_G(r_2)$ repetitions in succession. Clearly, we can
repeat this process until no nodes are left to obtain the single appearance schedule
$(q_G(r_1)\, r_1)\ (q_G(r_2)\, r_2) \ldots (q_G(r_n)\, r_n)$
for G. Thus we see that any acyclic SDF graph has a valid single appearance schedule.
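
The root-removal procedure just described can be sketched in a few lines of Python. The function name
and data layout are assumptions for illustration: the repetitions vector is passed in as a dictionary
rather than computed from the balance equations, and arcs are given as (source, sink) pairs. The
example graph uses the repetitions vector of figure 5 with hypothetical arcs A to C, B to C, and C to D.

```python
def acyclic_single_appearance_schedule(q, arcs):
    """Build (q(r1) r1)(q(r2) r2)...(q(rn) rn) by repeatedly removing a root node.

    q maps each actor to its repetition count; arcs is an iterable of
    (source, sink) pairs of an acyclic graph.
    """
    arcs = set(arcs)
    remaining = set(q)
    schedule = []
    while remaining:
        # A root has no incoming arc from a node that is still in the graph.
        roots = sorted(
            n for n in remaining
            if not any(snk == n and src in remaining for src, snk in arcs)
        )
        if not roots:
            raise ValueError("graph contains a directed cycle")
        r = roots[0]
        schedule.append((q[r], [r]))
        remaining.remove(r)
    return schedule


q = {"A": 100, "B": 100, "C": 10, "D": 1}
arcs = [("A", "C"), ("B", "C"), ("C", "D")]
print(acyclic_single_appearance_schedule(q, arcs))
```

Ties between several available roots are broken alphabetically here; any choice of root yields a
single appearance schedule.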

Also,  observe  that  if  G  is  an  arbitrary  SDF  graph,  then  we  can  cluster  the  subgraphs  asso¬ 
ciated  with  each  nontrivial  strongly  connected  component  of  G.  Clustering  a  strongly  connected 
component  into  a  single  node  never  results  in  deadlock  since  there  can  be  no  directed  loop  con¬ 
taining  the  clustered  node.  Since  clustering  all  strongly  connected  components  yields  an  acyclic 
graph,  it  follows  from  fact  4  and  fact  6  that  G  has  a  valid  single  appearance  schedule  if  and  only 
if  each  strongly  connected  component  has  a  valid  single  appearance  schedule. 

Observe that we must, in general, analyze a strongly connected component R as a separate
entity, since G may have a valid single appearance schedule even if there is a node N in R for
which we cannot fire all $q_G(N)$ invocations in succession. The key is that $q_R$ may be less than
$q_G$, so we may be able to generate a single appearance subschedule for R (e.g. we may be able to
schedule N $q_R(N)$ times in succession). Since we can schedule G so that R's subschedule appears
only once, this will translate to a single appearance schedule for G. For example, in figure 8(a), it
can be verified that q(A, B, C) = (10, 4, 5), but we cannot fire that many invocations of A, B, or C
in succession. However, consider the strongly connected component R* consisting of nodes A
and B. Then we obtain $q_{R^*}(A) = 5$ and $q_{R^*}(B) = 2$, and we immediately see that
$q_{R^*}(B)$ invocations of B can be scheduled in succession to obtain a subschedule for R*. The SDF
graph that results from clustering {A, B} is shown in figure 8(b). This leads to the valid single
appearance schedule (2 (2 B)(5 A))(5 C).

Fig. 8. An example of how clustering strongly connected components can improve looping: (a) the
original graph; (b) the graph that results from clustering {A, B}.

Theorem  4:  Suppose  that  G  is  a  connected  SDF  graph  and  suppose  that  G  has  a  valid  single 
appearance  schedule  of  some  arbitrary  blocking  factor.  Then  G  has  valid  single  appearance  sched¬ 
ules  for  all  blocking  factors. 

Proof1:  Clearly,  any  valid  schedule  S  of  unity  blocking  factor  can  be  converted  into  a  valid  sched¬ 
ule  of  arbitrary  blocking  factor  j  simply  by  encapsulating  S  inside  a  loop  of  j  iterations.  Thus,  it 
suffices  to  show  that  G  has  a  valid  single  appearance  schedule  of  unity  blocking  factor.  Now,  the¬ 
orem  3  guarantees  that  G  has  a  valid  fully  reduced  single  appearance  schedule,  and  theorem  2  tells 
us  that  this  schedule  has  blocking  factor  1.  QED. 

Corollary  1 :  Suppose  that  G  is  an  arbitrary  SDF  graph  that  has  a  valid  single  appearance  sched¬ 
ule.  Then  G  has  a  valid  single  appearance  schedule  for  all  blocking  vectors. 

Proof. Suppose S is a valid single appearance schedule for G, let $R_1, R_2, \ldots, R_k$ denote the
maximal connected subgraphs of G, let $J^*(R_1, R_2, \ldots, R_k) = (z_1, z_2, \ldots, z_k)$ be an
arbitrary blocking vector for G, and for $1 \le i \le k$, let $S_i$ denote the restriction of S to
$R_i$. Then from fact 4 each $S_i$ is a valid single appearance schedule for the corresponding $R_i$.
From theorem 4, for $1 \le i \le k$, there exists a valid single appearance schedule $S_i'$ of
blocking factor $z_i$ for $R_i$. Since the $R_i$'s are mutually disjoint and non-adjacent, it follows
that $S_1'\, S_2' \ldots S_k'$ is a valid single appearance schedule of blocking vector $J^*$ for G.
QED.

Our  condition  for  the  existence  of  a  valid  single  appearance  schedule  involves  a  form  of 
precedence  independence  that  we  call  subindependence. 


1. An alternative proof has been suggested by Sebastian Ritz of the Aachen University of Technology. This proof is based on the
observations that (1) constructing blocking factor 1 schedules for acyclic graphs is easy (we simply use the process described at
the beginning of this section), and (2) if a strongly connected SDF graph G has a single appearance schedule then it has a
subindependent subset of nodes (see definition 9), which allows us to decompose G into smaller collections of strongly connected
components. By hierarchically scheduling the input graph based on observations (1) and (2), we can always construct a blocking factor 1
schedule.

Definition 9: Suppose that G is a connected SDF graph. If $Z_1$ and $Z_2$ are disjoint subsets of N(G)
we say that "$Z_1$ is subindependent of $Z_2$ in G" if for every arc $\alpha$ in G such that
$\mathrm{source}(\alpha) \in Z_2$ and $\mathrm{sink}(\alpha) \in Z_1$, we have
$\mathrm{delay}(\alpha) \ge q_G(\mathrm{sink}(\alpha))\, c(\alpha)$.

Thus, $Z_1$ is subindependent of $Z_2$ in G if given a minimal (unit blocking factor) valid
schedule for G, data produced by $Z_2$ is never consumed by $Z_1$ in the same schedule period of G in
which it is produced. Thus, at the beginning of each schedule period for G, all of the data required
by $Z_1$ from $Z_2$ for that schedule period is available at the inputs of $Z_1$. For example,
consider the SDF graph in figure 9. Here q(A, B, C, D) = (2, 1, 2, 2), and we see that {A, D} is
subindependent of {B, C} and trivially, {B, C, A} is subindependent of {D}.
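
Definition 9 amounts to a check on the arcs that cross from $Z_2$ into $Z_1$. The following sketch
illustrates that check. Figure 9 is not reproduced here, so the arcs below are hypothetical: they were
chosen only to be consistent with q(A, B, C, D) = (2, 1, 2, 2) and with the two subindependence
statements in the text, and they are not claimed to be the arcs of figure 9. The function name and the
(source, sink, consumed, delay) tuple layout are likewise assumptions.

```python
def is_subindependent(Z1, Z2, arcs, q):
    """True iff every arc from Z2 into Z1 has delay >= q(sink) * consumed.

    arcs is a list of (source, sink, consumed_per_firing, delay) tuples.
    """
    return all(
        delay >= q[snk] * consumed
        for src, snk, consumed, delay in arcs
        if src in Z2 and snk in Z1
    )


q = {"A": 2, "B": 1, "C": 2, "D": 2}
# Hypothetical arcs (see the assumptions above).
arcs = [
    ("A", "B", 2, 0),
    ("B", "C", 1, 0),
    ("C", "A", 1, 2),
    ("C", "D", 1, 2),
]

print(is_subindependent({"A", "D"}, {"B", "C"}, arcs, q))
print(is_subindependent({"B", "C"}, {"A", "D"}, arcs, q))
print(is_subindependent({"A", "B", "C"}, {"D"}, arcs, q))
```

In this assumed graph no arc leaves D, so the last check holds vacuously, which is the sense in which
such a case is trivial.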

We  are  now  ready  to  establish  a  recursive  condition  for  the  existence  of  a  valid  single 
appearance  schedule.  Recall  that  an  arbitrary  SDF  graph  has  a  valid  single  appearance  schedule 
iff  each  strongly  connected  component  has  a  single  appearance  schedule.  Theorem  5  gives  neces¬ 
sary  and  sufficient  conditions  for  a  strongly  connected  SDF  graph  to  have  a  valid  single  appear¬ 
ance  schedule. 

Theorem 5: Suppose that G is a nontrivial strongly connected SDF graph. Then G has a valid
single appearance schedule if and only if there exists a nonempty proper subset $N_s$ of N(G) such
that

(1) $N_s$ is subindependent of $(N(G) - N_s)$ in G; and

(2) subgraph($N_s$, G) and subgraph($(N(G) - N_s)$, G) both have valid single appearance
schedules.

Proof: ($\Leftarrow$) Let S and T denote valid single appearance schedules for Y = subgraph($N_s$, G)
and Z = subgraph($(N(G) - N_s)$, G) respectively; let $y_1, y_2, \ldots, y_k$ denote the maximal
connected subsets of N(Y); and let $z_1, z_2, \ldots, z_l$ denote the maximal connected subsets of
N(Z). From corollary 1, we can assume without loss of generality that for $1 \le i \le k$,
$J_S(\mathrm{subgraph}(y_i)) = q_G(y_i)$, and that for $1 \le i \le l$,
$J_T(\mathrm{subgraph}(z_i)) = q_G(z_i)$. From fact 5, it follows that S invokes each $N \in N_s$
$q_G(N)$ times, and T invokes each $N \in (N(G) - N_s)$ $q_G(N)$ times, and since $N_s$ is
subindependent, it follows that (S T) is a valid single appearance schedule (of blocking factor 1)
for G.

Fig. 9. An example used to illustrate subindependence.

($\Rightarrow$) Suppose that S is a valid single appearance schedule for G. From theorem 4, we can
assume without loss of generality that S has blocking factor 1. Then S can be expressed as
$S_a S_b$, where $S_a$ and $S_b$ are nonempty single appearance subschedules of S that are not
encompassed by a loop (if we could represent S as a single loop (n (...) (...) ... (...)) then
$\gcd(\{q_G(N) \mid N \in N(G)\}) \ge n$, so S is not of unity blocking factor, a contradiction).
Since $S_a S_b$ is a valid single appearance schedule for G, every actor $N \in \mathrm{actors}(S_a)$
is fired $q_G(N)$ times before any actor outside of actors($S_a$) is invoked. It follows that
actors($S_a$) is subindependent of actors($S_b$). Also by fact 4, $S_a$ is a valid single appearance
schedule for subgraph(actors($S_a$)) and $S_b$ is a valid single appearance
schedule for subgraph(actors($S_b$)). QED.

Theorem 5 shows that a strongly connected SDF graph G has a valid single appearance
schedule only if we can find a subindependent partition of the nodes, that is, a partition into two
subsets $Z_1$ and $Z_2$ such that $Z_1$ is subindependent of $Z_2$. If we can find such $Z_1$ and
$Z_2$, then we can construct a valid single appearance schedule for G by constructing a valid single
appearance schedule for all invocations associated with $Z_1$ and then concatenating a valid single
appearance schedule for all invocations associated with $Z_2$. By repeatedly applying this
decomposition, we can construct valid single appearance schedules whenever they exist [2].

The partitioning process on which this decomposition method is based can be performed
efficiently. Given a nontrivial strongly connected SDF graph G, we first remove all arcs from G
whose delay is at least the total number of samples consumed from the arc in a schedule
period (for an arc a, this total is c(a) qG(sink(a))). If the resulting graph G' is still strongly
connected then no subindependent partition exists. Otherwise, any root strongly connected
component of G' (one with no incoming arcs from the other components) is subindependent. This
method of determining a subindependent partition is illustrated in figure 10.
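
The partitioning test can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in our framework: the names Arc, reachable_from, strongly_connected_components and subindependent_partition are hypothetical, and the arc rates and delays in the example are assumptions chosen to agree with the repetitions vector and the D-to-B delay quoted for figure 10, since the figure's full annotation is not reproduced in the text. Components are found by mutual reachability, which is adequate for small graphs; a linear-time algorithm would be preferable for large ones.

```python
from collections import namedtuple

# One SDF arc: source actor, sink actor, samples produced and consumed
# per invocation, and the number of delays on the arc.
Arc = namedtuple("Arc", "src snk prod cons delay")


def reachable_from(start, arcs):
    """Return the set of actors reachable from `start` (including itself)."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for a in arcs:
            if a.src == node and a.snk not in seen:
                seen.add(a.snk)
                stack.append(a.snk)
    return seen


def strongly_connected_components(nodes, arcs):
    """Group actors by mutual reachability."""
    reach = {n: reachable_from(n, arcs) for n in nodes}
    comps = []
    for n in nodes:
        if not any(n in c for c in comps):
            comps.append({m for m in nodes if m in reach[n] and n in reach[m]})
    return comps


def subindependent_partition(nodes, arcs, q):
    """Return (Z1, Z2) with Z1 subindependent of Z2, or None if none exists."""
    # Remove arcs whose delay covers every sample consumed in a schedule period.
    kept = [a for a in arcs if a.delay < a.cons * q[a.snk]]
    comps = strongly_connected_components(nodes, kept)
    if len(comps) == 1:
        return None  # G' is still strongly connected
    for comp in comps:
        # A root component has no incoming arc from outside itself in G'.
        if not any(a.snk in comp and a.src not in comp for a in kept):
            return comp, set(nodes) - comp


# Assumed arcs (hypothetical), consistent with q(A, B, C, D) = (1, 10, 2, 20).
nodes = ["A", "B", "C", "D"]
q = {"A": 1, "B": 10, "C": 2, "D": 20}
arcs = [
    Arc("A", "B", 10, 1, 1),
    Arc("B", "A", 1, 10, 9),
    Arc("B", "C", 1, 5, 0),
    Arc("C", "D", 10, 1, 0),
    Arc("D", "B", 1, 2, 25),  # delay 25 >= 2 * 10, so this arc is removed
]
z1, z2 = subindependent_partition(nodes, arcs, q)
print(sorted(z1), sorted(z2))  # ['A', 'B'] ['C', 'D']
```

With the delay on the arc from D to B reduced below 20, no arc is removed, the graph stays strongly connected, and the function returns None.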
Fig. 10. An example of constructing a subindependent partition. For the strongly connected SDF
graph on the left, q(A, B, C, D) = (1, 10, 2, 20). Thus the delay on the arc directed from D to B (25)
exceeds the total number of samples consumed by B in a schedule period (20). We remove this arc
to obtain the new graph on the right. Since this graph is not strongly connected, a subindependent
partition exists: the root strongly connected component {A, B} is subindependent of the rest of the
graph {C, D}.
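
Repeated application of the decomposition can be sketched recursively. The following Python fragment is one possible reading of the method, given under stated assumptions rather than as the algorithm of [2]: at each level it factors out the gcd of the repetition counts as a loop count, removes arcs whose delay covers a full period of consumption at the reduced counts, orders the strongly connected components of what remains with roots first, and recurses into each component. All names (Arc, ordered_components, single_appearance_schedule, render) and the three-actor example graph are hypothetical, self-loop arcs are assumed to be absent, and a result of None means only that this sketch found no decomposition.

```python
from collections import namedtuple
from functools import reduce
from math import gcd

Arc = namedtuple("Arc", "src snk prod cons delay")


def reachable_from(start, arcs):
    """Return the set of actors reachable from `start` (including itself)."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for a in arcs:
            if a.src == node and a.snk not in seen:
                seen.add(a.snk)
                stack.append(a.snk)
    return seen


def ordered_components(nodes, arcs):
    """Strongly connected components of (nodes, arcs), root components first."""
    reach = {n: reachable_from(n, arcs) for n in nodes}
    comps = []
    for n in nodes:
        if not any(n in c for c in comps):
            comps.append([m for m in nodes if m in reach[n] and n in reach[m]])
    ordered = []
    while comps:
        for c in comps:
            # Emit a component with no incoming arc from another remaining one.
            if not any(a.snk in c and a.src not in c
                       and any(a.src in d for d in comps) for a in arcs):
                ordered.append(c)
                comps.remove(c)
                break
    return ordered


def single_appearance_schedule(nodes, arcs, q):
    """Return a nested (count, body) term, or None if no decomposition is found."""
    g = reduce(gcd, (q[n] for n in nodes))
    if len(nodes) == 1:
        return (g, [nodes[0]])
    local = {n: q[n] // g for n in nodes}  # counts inside the loop (g ...)
    kept = [a for a in arcs if a.src in nodes and a.snk in nodes
            and a.delay < a.cons * local[a.snk]]
    comps = ordered_components(nodes, kept)
    if len(comps) == 1:
        return None  # still strongly connected: no subindependent partition
    body = []
    for comp in comps:
        term = single_appearance_schedule(comp, arcs, local)
        if term is None:
            return None
        body.append(term)
    return (g, body)


def render(term):
    """Print a (count, body) term in looped-schedule notation."""
    count, body = term
    inner = " ".join(b if isinstance(b, str) else render(b) for b in body)
    return inner if count == 1 else f"({count} {inner})"


# Hypothetical strongly connected graph with q(A, B, C) = (1, 2, 2).
q = {"A": 1, "B": 2, "C": 2}
arcs = [
    Arc("A", "B", 2, 1, 0),
    Arc("B", "C", 1, 1, 0),
    Arc("C", "B", 1, 1, 1),
    Arc("C", "A", 1, 2, 2),
]
print(render(single_appearance_schedule(["A", "B", "C"], arcs, q)))  # A (2 B C)
```

In this example the flat ordering A (2 B) (2 C) would deadlock, because the arc from C to B carries only one delay; the nested loop (2 B C) arises from factoring the common count 2 out of the component {B, C} before testing its arcs again.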


7  Conclusion 


We have formally discussed the organization and manipulation of loops in uniprocessor
schedules for synchronous dataflow programs. We have introduced two main techniques: (1)
constructing single appearance schedules, which permit the efficiency of inlined code without
requiring any code duplication across multiple invocations of the same functional block; and (2)
factoring loops in a single appearance schedule to reduce the amount of memory required for
buffering. Based on our technique for constructing single appearance schedules, we have
implemented a scheduling framework for synthesizing optimally compact programs for a large class
of applications. This framework defines a way to incorporate other scheduling objectives in a
manner that does not conflict with the full code compaction potential offered by subindependent
partitions. For example, the incorporation of techniques to make buffering more efficient is
discussed in [2].

The tradeoffs involved in compiling an SDF program are complex. These tradeoffs
include the effects of parallelization; code compactness; the amount of memory required for
buffering; the amount of data transfers that occur only through machine registers; the number of
subroutine calls and their associated overhead; the amount of context-switch overhead, as [20]
addresses; and the total loop overhead (initiation and indexing). We have only begun to explore
these tradeoffs; the existing techniques focus on a small subset of the full range of
considerations. A more global objective of the formal techniques for working with looped
schedules that this paper presents is to facilitate the exploration of tradeoffs involved in
compiling SDF programs. This is demonstrated to some extent by our scheduling framework [2];
there is much more room for work in this area.
Glossary

actors(S)  The set of actors that appear in the looped schedule S.

admissible schedule  A schedule that does not deadlock.

c(a)  The number of samples consumed from SDF arc a by one invocation of sink(a).

delay(a)  The number of delays on SDF arc a.

gcd  Greatest common divisor.

inv(X, S)  The number of times that the looped schedule S invokes actor or subschedule X.

iterand  Given a schedule loop (n Ψ1 Ψ2 ... Ψk), we refer to each Ψi as an iterand of the
    schedule loop. Given a looped schedule S = V1V2...Vk, where each Vi is either an
    actor or a schedule loop, we refer to each Vi as an iterand of S.

iteration count  Given a schedule loop (n Ψ1 Ψ2 ... Ψk), we refer to n as the iteration count.

J(S)  Given a periodic schedule S for a connected SDF graph, J(S) denotes the blocking
    factor of S. Every periodic schedule invokes each actor N some multiple of qG(N)
    times. This multiple is the blocking factor.

Js  Given a periodic schedule S for an SDF graph G, Js denotes the blocking vector of
    S. Js is indexed by the maximal connected subgraphs of G. For each maximal
    connected subgraph C, Js(C) = J(restriction(S, C)).

looped schedule  A schedule that has zero or more parenthesized terms of the form
    (n Ψ1 Ψ2 ... Ψk), where n is a nonnegative integer, and each Ψi represents either an
    SDF node or another parenthesized term. (n Ψ1 Ψ2 ... Ψk) represents the
    successive repetition n times of the firing sequence Ψ1 Ψ2 ... Ψk.

max_connected(G)  The set of maximal connected subgraphs of the SDF graph G.

N(G)  The set of nodes in the SDF graph G.

P(a, i, S)  The number of invocations of the source actor of SDF arc a that precede the ith
    invocation of sink(a) in schedule S.

p(a)  The number of samples produced onto SDF arc a by one invocation of source(a).

periodic schedule  A schedule that invokes each actor at least once and produces no net
    change in the number of samples buffered on any arc.

qG  The repetitions vector qG of the connected SDF graph G is a vector that is indexed
    by the nodes in G. qG gives the minimum number of times that each node must be
    invoked in a periodic schedule for G.

restriction  Given a looped schedule S and a set of actors Z, the restriction of S to Z, denoted
    restriction(S, Z), is the looped schedule obtained by removing from S all actors
    that are not contained in Z, and removing all empty schedule loops that result.
    Also, given an SDF subgraph G, restriction(S, G) = restriction(S, N(G)).

single appearance schedule  A looped schedule in which no actor appears more than
    once.

sink(a)  The actor at the sink of SDF arc a.

source(a)  The actor at the source of SDF arc a.

subgraph  A subgraph of an SDF graph G is the graph formed by any subset Z of nodes in G
    together with all arcs a in G for which source(a), sink(a) ∈ Z. We denote the
    subgraph corresponding to the subset of nodes Z by subgraph(Z, G), or simply by
    subgraph(Z) if G is understood from context.

subschedule  A subschedule of a looped schedule S is a sequence of successive iterands of S or a
    sequence of successive iterands of a schedule loop contained in S.

termination of a schedule  If S is not an admissible schedule then some invocation f in S is not
    fireable immediately after all of its antecedents in the schedule have
    fired. Thus f does not have sufficient data on at least one of its input
    arcs. If a is one such input arc, we say that S terminates on a at f.

valid schedule  A schedule that is both periodic and admissible.
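
Several of these definitions can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not part of our framework: a looped schedule is represented as a list whose items are either actor names or (n, [iterands]) pairs, and the names Arc, flatten, appearances, is_single_appearance and is_valid, as well as the two-actor example graph, are hypothetical. The check for a valid schedule simulates the firing sequence, testing admissibility before each firing and periodicity at the end.

```python
from collections import namedtuple

# One SDF arc: source actor, sink actor, p(a), c(a) and delay(a).
Arc = namedtuple("Arc", "src snk prod cons delay")


def flatten(schedule):
    """Expand a looped schedule into the firing sequence it represents."""
    firings = []
    for item in schedule:
        if isinstance(item, str):
            firings.append(item)
        else:
            count, body = item
            firings.extend(flatten(body) * count)
    return firings


def appearances(schedule):
    """List the actor appearances in the looped (unexpanded) schedule."""
    names = []
    for item in schedule:
        names.extend([item] if isinstance(item, str) else appearances(item[1]))
    return names


def is_single_appearance(schedule):
    """True if no actor appears more than once in the looped schedule."""
    names = appearances(schedule)
    return len(names) == len(set(names))


def is_valid(schedule, actors, arcs):
    """True if the schedule is both admissible and periodic for the graph."""
    tokens = [a.delay for a in arcs]
    firings = flatten(schedule)
    for actor in firings:
        # Admissible: every input arc holds enough samples before the firing.
        if any(a.snk == actor and tokens[i] < a.cons
               for i, a in enumerate(arcs)):
            return False
        for i, a in enumerate(arcs):
            if a.snk == actor:
                tokens[i] -= a.cons
            if a.src == actor:
                tokens[i] += a.prod
    # Periodic: each actor fires at least once and no buffer changes in size.
    return set(firings) == set(actors) and tokens == [a.delay for a in arcs]


# Hypothetical graph: A produces 2 samples per firing, B consumes 3; q = (3, 2).
actors = ["A", "B"]
arcs = [Arc("A", "B", 2, 3, 0)]

looped = [(3, ["A"]), (2, ["B"])]      # (3 A)(2 B)
interleaved = ["A", "A", "B", "A", "B"]  # A A B A B
reversed_order = [(2, ["B"]), (3, ["A"])]  # (2 B)(3 A)

print(flatten(looped))                          # ['A', 'A', 'A', 'B', 'B']
print(is_valid(looped, actors, arcs))           # True
print(is_single_appearance(interleaved))        # False
print(is_valid(reversed_order, actors, arcs))   # False
```

The interleaved schedule A A B A B is also valid for this graph, but it is not a single appearance schedule; (2 B)(3 A) terminates on the arc at the first invocation of B.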